FSS 4046

ARTICLE IN PRESS

Fuzzy Sets and Systems
www.elsevier.com/locate/fss

Support vector fuzzy regression machines

Dug Hun Hong^a,*, Changha Hwang^b

^a School of Mechanical and Automotive Engineering, Catholic University of Daegu, 330 Keumrak 1-ri, Hayang-up, Kyongsan, Kyungbuk 712-702, South Korea
^b Department of Statistical Information, Catholic University of Daegu, Kyungbuk 712-702, South Korea

Received 3 August 2001; received in revised form 9 October 2001; accepted 28 October 2002

Abstract

Support vector machines (SVMs) have been very successful in pattern recognition and function estimation problems. In this paper, we introduce the use of SVMs for multivariate fuzzy linear and nonlinear regression models. Using the basic idea underlying the SVM for multivariate fuzzy regression makes the computation of solutions efficient.
© 2002 Published by Elsevier Science B.V.

Keywords: Fuzzy inference systems; Support vector machine; Fuzzy regression; Fuzzy system models

1. Introduction

Linear regression models are widely used today in business, administration, economics, and engineering, as well as in many traditionally nonquantitative fields such as the social, health, and biological sciences. Fuzzy regression is recommended for practical situations in which decisions often have to be made on the basis of imprecise and/or partially available data, and many different fuzzy regression approaches have been proposed. Fuzzy regression, as first developed by Tanaka et al. [14] in a linear setting, is based on the extension principle. Tanaka et al. [14] initially applied their fuzzy linear regression procedure to nonfuzzy experimental data. In the experiments that followed this pioneering effort, Tanaka et al. [15] used fuzzy input data, given in the form of triangular fuzzy numbers, to build fuzzy regression models. The process is explained in more detail by Dubois and Prade [10]. A technique for linear least-squares fitting of fuzzy variables was developed by Diamond [7,8], giving the solution to an analog of the normal equations of classical least squares. A collection of recent papers dealing with several approaches to fuzzy regression analysis can be found in [12].

* Corresponding author. Tel.: +82-53-850-2712; fax: +82-53-850-2799. E-mail address: [email protected] (D.H. Hong).

0165-0114/02/$ - see front matter © 2002 Published by Elsevier Science B.V.
PII: S0165-0114(02)00514-6

In contrast to fuzzy linear regression, there have been only a few articles on fuzzy nonlinear regression. Researchers in fuzzy nonlinear regression have mostly been concerned with data consisting of crisp inputs and fuzzy output, although some papers, for example [2,3,5], treat data sets with fuzzy inputs and fuzzy output. In this paper, we treat fuzzy nonlinear regression for data with crisp inputs and fuzzy output.

In this paper, we discuss multivariate fuzzy linear and nonlinear regression by support vector machine (SVM). The SVM has recently been introduced for solving pattern recognition and function estimation problems [1,4,6,9,11,13,16–18]. It is a nonlinear generalization of the Generalized Portrait algorithm developed in Russia in the 1960s. In its present form, the SVM was developed at AT&T Bell Laboratories by Vapnik and co-workers. Owing to this industrial context, SVM research has to date had a sound orientation towards real-world applications. SVM learning has now evolved into an active area of research, and it is in the process of entering the standard methods toolbox of machine learning. The SVM is based on the idea of structural risk minimization, which shows that the generalization error is bounded by the sum of the training error and a term depending on the Vapnik–Chervonenkis dimension; by minimizing this bound, high generalization performance can be achieved. Moreover, unlike in other machine learning methods, the generalization error of an SVM is not directly related to the input dimensionality of the problem, which explains why SVMs can perform well even on high-dimensional problems. The main difference between our SVM approach and the nonlinear approaches of Buckley et al. [2,3] and Celmins [5] is not crisp input–fuzzy output versus fuzzy input–fuzzy output, but model-free versus model-dependent.

The rest of this paper is organized as follows. Section 2 illustrates the SVM regression procedures for fuzzy multivariate linear models. Section 3 describes how to apply this idea to the fuzzy multivariate nonlinear model with numerical inputs and fuzzy output. Section 4 gives some conclusions.

2. SVM fuzzy linear regression

In this section, we modify the underlying idea of the SVM in order to derive convex optimization problems for multivariate fuzzy linear regression models; in Section 3 we consider the multivariate fuzzy nonlinear regression model for numerical inputs and fuzzy output. The basic idea of the SVM gives computational efficiency in finding solutions of fuzzy regression models, particularly in the multivariate case. Here, we consider three different models for multivariate fuzzy linear regression. To do this we need some preliminaries. Let X = (m, α, β) be a triangular fuzzy number, where m is the modal value of X and α and β are the left and right spreads, respectively. If α = β, we write X = (m, α). On the space T(R) of all triangular fuzzy numbers we use the metric d defined by

d(X, Y)² = (m_X − m_Y)² + ((m_X − α_X) − (m_Y − α_Y))² + ((m_X + β_X) − (m_Y + β_Y))²,

where X = (m_X, α_X, β_X) and Y = (m_Y, α_Y, β_Y) are any two triangular fuzzy numbers in T(R). A linear structure is defined on T(R) by (m_X, α_X, β_X) + (m_Y, α_Y, β_Y) = (m_X + m_Y, α_X + α_Y, β_X + β_Y), t(m, α, β) = (tm, tα, tβ) if t ≥ 0, and t(m, α, β) = (tm, −tβ, −tα) if t < 0.
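As a concrete illustration, the metric d and the linear structure above can be sketched in Python. The class and function names (`TFN`, `d2`) are our illustrative choices, not from the paper:

```python
from dataclasses import dataclass

@dataclass
class TFN:
    """Triangular fuzzy number (m, alpha, beta): modal value and left/right spreads."""
    m: float
    alpha: float
    beta: float

    def __add__(self, other):
        # Componentwise addition from the linear structure on T(R).
        return TFN(self.m + other.m, self.alpha + other.alpha, self.beta + other.beta)

    def scale(self, t):
        # Scalar multiplication: the spreads swap (and stay nonnegative) for t < 0.
        if t >= 0:
            return TFN(t * self.m, t * self.alpha, t * self.beta)
        return TFN(t * self.m, -t * self.beta, -t * self.alpha)

def d2(x, y):
    """Squared metric d(X, Y)^2: modal values plus left and right endpoints."""
    return ((x.m - y.m) ** 2
            + ((x.m - x.alpha) - (y.m - y.alpha)) ** 2
            + ((x.m + x.beta) - (y.m + y.beta)) ** 2)
```

The three squared terms compare the modal value and the two endpoints of the triangles, so d vanishes exactly when the two fuzzy numbers coincide.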

2.1. Case of fuzzy inputs and fuzzy output

Suppose that the observations consist of data pairs (X_i, Y_i), i = 1, …, l, where X_i = ((m_Xi1, α_Xi1, β_Xi1), …, (m_Xid, α_Xid, β_Xid)) ∈ T(R)^d and Y_i = (m_Yi, α_Yi, β_Yi) ∈ T(R). Here T(R) and T(R)^d are the set of triangular fuzzy numbers and the set of d-vectors of triangular fuzzy numbers, respectively. Let m_Xi = (m_Xi1, …, m_Xid), α_Xi = (α_Xi1, …, α_Xid) and β_Xi = (β_Xi1, …, β_Xid). Now, we consider the following two models M1 and M2:

M1: f(X) = B + ⟨w, X⟩,   B ∈ T(R), w ∈ R^d,
M2: f(X) = b + ⟨w, X⟩,   b ∈ R, w ∈ R^d.

For computational simplicity we assume that X_i, Y_i and B are symmetric triangular fuzzy numbers. Then model M1 can be rewritten as

M1: f(X) = (⟨w, m_X⟩ + m_B, ⟨|w|, α_X⟩ + α_B),

where |w| = (|w₁|, |w₂|, …, |w_d|). Now we consider how to obtain solutions for the two models M1 and M2; in fact, the solution of model M2 follows directly from that of model M1.

Model M1: By modifying the basic idea of the SVM for crisp linear regression [11,13,16–18], we arrive at the following convex optimization problem for model M1:

minimize   (1/2)‖w‖² + C Σ_{k=1}^{2} Σ_{i=1}^{l} (ξ_ki + ξ*_ki)

subject to
    m_Yi − ⟨w, m_Xi⟩ − m_B ≤ ε + ξ_1i,
    ⟨w, m_Xi⟩ + m_B − m_Yi ≤ ε + ξ*_1i,
    (m_Yi − α_Yi) − (⟨w, m_Xi⟩ + m_B − ⟨|w|, α_Xi⟩ − α_B) ≤ ε + ξ_2i,        (1)
    (⟨w, m_Xi⟩ + m_B − ⟨|w|, α_Xi⟩ − α_B) − (m_Yi − α_Yi) ≤ ε + ξ*_2i,
    ξ_ki, ξ*_ki ≥ 0,  k = 1, 2.

Then, constructing a Lagrange function and differentiating it with respect to m_B, α_B, w, ξ_ki and ξ*_ki, k = 1, 2, we can derive the corresponding dual optimization problem for model M1 as follows:

maximize   −(1/2)‖w‖² − ε Σ_{k=1}^{2} Σ_{i=1}^{l} (λ_ki + λ*_ki) + Σ_{i=1}^{l} m_Yi (λ_1i − λ*_1i) + Σ_{i=1}^{l} (m_Yi − α_Yi)(λ_2i − λ*_2i)

subject to   Σ_{i=1}^{l} (λ_ki − λ*_ki) = 0,  λ_ki, λ*_ki ∈ [0, C],  k = 1, 2,

where

w = Σ_{i=1}^{l} (λ_1i − λ*_1i) m_Xi + Σ_{i=1}^{l} (λ_2i − λ*_2i)(m_Xi − α_Xi).

Solving the above problem under the constraints determines the Lagrange multipliers λ_ki, λ*_ki, and the optimal regression function is given by

f(X) = Σ_{i=1}^{l} (λ_1i − λ*_1i) ⟨m_Xi, X⟩ + Σ_{i=1}^{l} (λ_2i − λ*_2i) ⟨m_Xi − α_Xi, X⟩ + B.        (2)

Now we need to find m_B and α_B. By the Karush–Kuhn–Tucker (KKT) conditions,

λ_1i (ε + ξ_1i − m_Yi + ⟨w, m_Xi⟩ + m_B) = 0,
λ*_1i (ε + ξ*_1i + m_Yi − ⟨w, m_Xi⟩ − m_B) = 0,
(C − λ_1i) ξ_1i = 0,   (C − λ*_1i) ξ*_1i = 0,

and hence

m_B = m_Yi − ⟨w, m_Xi⟩ − ε   for λ_1i ∈ (0, C),
m_B = m_Yi − ⟨w, m_Xi⟩ + ε   for λ*_1i ∈ (0, C).

To find α_B we need to solve the optimization problem

minimize_{α_B ≥ 0}   Σ_{i=1}^{l} |m_Yi − α_Yi − (⟨w, m_Xi⟩ + m_B − ⟨|w|, α_Xi⟩ − α_B)|_ε,

where the ε-insensitive loss function |·|_ε is defined by

|δ|_ε = 0 if |δ| ≤ ε,   |δ|_ε = |δ| − ε otherwise.
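The ε-insensitive loss and the resulting one-dimensional objective in α_B can be written down directly. A minimal sketch; the function names are ours, and `residuals` stands for the bracketed quantity above evaluated without α_B:

```python
def eps_insensitive(delta, eps):
    """Vapnik's epsilon-insensitive loss |delta|_eps: zero inside the tube,
    linear outside it."""
    return max(abs(delta) - eps, 0.0)

def spread_objective(residuals, alpha_b, eps):
    """Objective of the one-dimensional search over alpha_B >= 0.
    Each r_i = m_Yi - alpha_Yi - (<w, m_Xi> + m_B - <|w|, alpha_Xi>),
    so the fitted lower-boundary error of point i is r_i + alpha_B."""
    return sum(eps_insensitive(r + alpha_b, eps) for r in residuals)
```

Since each term is piecewise linear in α_B, the objective is convex and easy to minimize over the half-line α_B ≥ 0.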

Model M2: Since the solution for model M2 follows directly from that for model M1, we can obtain the solutions as follows. The objective function to be maximized is the same as for model M1, but the constraints differ slightly. We compute w, m_B and α_B in the same way as for model M1:

maximize   −(1/2)‖w‖² − ε Σ_{k=1}^{2} Σ_{i=1}^{l} (λ_ki + λ*_ki) + Σ_{i=1}^{l} m_Yi (λ_1i − λ*_1i) + Σ_{i=1}^{l} (m_Yi − α_Yi)(λ_2i − λ*_2i)

subject to   Σ_{k=1}^{2} Σ_{i=1}^{l} (λ_ki − λ*_ki) = 0  and  λ_ki, λ*_ki ∈ [0, C],  k = 1, 2.

For this model, b is exactly the same as m_B for model M1, and hence

b = m_Yi − ⟨w, m_Xi⟩ − ε   for λ_1i ∈ (0, C),
b = m_Yi − ⟨w, m_Xi⟩ + ε   for λ*_1i ∈ (0, C).

Example 1. Following Gunn [11], data were constructed using the original x_i, y_i and the symmetric fuzzified X_i, Y_i shown in Table 1. Using the data in Table 1, the results obtained are shown in Table 2. In this example ε and C are determined heuristically; in fact, all values C ≥ 500 give almost the same results, whereas the results depend highly on the value of ε.

Table 1. Fuzzy input–fuzzy output data

X = (m_X, α_X)    Y = (m_Y, α_Y)
(1, 0.5)          (−1.6, 0.5)
(3, 0.5)          (−1.8, 0.5)
(4, 0.5)          (−1.0, 0.5)
(5.6, 0.8)        (1.2, 0.5)
(7.8, 0.8)        (2.2, 1.0)
(10.2, 0.8)       (6.8, 1.0)
(11.0, 1.0)       (10.0, 1.0)
(11.5, 1.0)       (10.0, 1.0)
(12.7, 1.0)       (10.0, 1.0)

Table 2. Results for models M1 and M2

Model    ε       C      Coefficients                          Residual sum
M1       0       500    B = (−2.457, 0.071), w = 0.857        95.314
M2       0.01    500    b = −5.507, w = 1.239                 50.338

2.2. Case of numerical inputs and fuzzy output

Suppose that data pairs (x_i, Y_i), i = 1, 2, …, l are observed, where each x_i is a d-vector of real numbers and each Y_i ∈ T(R). Let x_ij denote the jth element of x_i; we may assume x_ij ≥ 0 after a simple translation of all vectors. Let W = (W₁, W₂, …, W_d), where W_i = (m_Wi, α_Wi, β_Wi) with α_Wi, β_Wi ≥ 0, i = 1, …, d, and let B = (m_B, α_B, β_B) with α_B, β_B ≥ 0. We now consider the following model:

M3: f(x) = B + ⟨W, x⟩ = B + W₁x₁ + W₂x₂ + ⋯ + W_d x_d,   B ∈ T(R), W ∈ T(R)^d, x ∈ R^d,

where T(R)^d is the set of d-vectors of triangular fuzzy numbers. We define

‖W‖² = ‖m_W‖² + ‖m_W − α_W‖² + ‖m_W + β_W‖²,

where m_W = (m_W1, …, m_Wd), α_W = (α_W1, …, α_Wd) and β_W = (β_W1, …, β_Wd). Then, we arrive at the following convex optimization
problem for model M3:

minimize   (1/2)‖W‖² + C Σ_{k=1}^{3} Σ_{i=1}^{l} (ξ_ki + ξ*_ki)

subject to
    m_Yi − (⟨m_W, x_i⟩ + m_B) ≤ ε + ξ_1i,
    (⟨m_W, x_i⟩ + m_B) − m_Yi ≤ ε + ξ*_1i,
    (m_Yi − α_Yi) − (⟨m_W, x_i⟩ + m_B − ⟨α_W, x_i⟩ − α_B) ≤ ε + ξ_2i,
    (⟨m_W, x_i⟩ + m_B − ⟨α_W, x_i⟩ − α_B) − (m_Yi − α_Yi) ≤ ε + ξ*_2i,
    (m_Yi + β_Yi) − (⟨m_W, x_i⟩ + m_B + ⟨β_W, x_i⟩ + β_B) ≤ ε + ξ_3i,
    (⟨m_W, x_i⟩ + m_B + ⟨β_W, x_i⟩ + β_B) − (m_Yi + β_Yi) ≤ ε + ξ*_3i,
    ξ_ki, ξ*_ki ≥ 0,  k = 1, 2, 3.

Then, constructing a Lagrange function and differentiating it with respect to m_B, α_B, β_B, m_W, α_W, β_W, ξ_ki and ξ*_ki, k = 1, 2, 3, we get

m_W = Σ_{i=1}^{l} (λ_1i − λ*_1i) x_i,
α_W = Σ_{i=1}^{l} [(λ_1i − λ*_1i) − (λ_2i − λ*_2i)] x_i,
β_W = Σ_{i=1}^{l} [(λ_3i − λ*_3i) − (λ_1i − λ*_1i)] x_i,

and in effect can derive the corresponding dual optimization problem for model M3 as follows:

maximize   −(1/2) Σ_{k=1}^{3} Σ_{i,j=1}^{l} (λ_ki − λ*_ki)(λ_kj − λ*_kj) ⟨x_i, x_j⟩ − ε Σ_{k=1}^{3} Σ_{i=1}^{l} (λ_ki + λ*_ki)
           + Σ_{i=1}^{l} m_Yi (λ_1i − λ*_1i) + Σ_{i=1}^{l} (m_Yi − α_Yi)(λ_2i − λ*_2i) + Σ_{i=1}^{l} (m_Yi + β_Yi)(λ_3i − λ*_3i)

subject to   Σ_{i=1}^{l} (λ_ki − λ*_ki) = 0,  k = 1, 2, 3,
             λ_ki, λ*_ki ∈ [0, C],  k = 1, 2, 3,  i = 1, …, l.
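The coefficient recovery from the net multipliers, and the norm ‖W‖² used in the primal objective, can be sketched as follows. This is a plain-Python illustration; the function names are ours, and the multipliers are passed as the differences λ_ki − λ*_ki:

```python
def recover_coefficients(lam1, lam2, lam3, xs):
    """Recover m_W, alpha_W, beta_W from net multipliers nu_ki = lambda_ki - lambda*_ki:
      m_W     = sum_i nu_1i x_i
      alpha_W = sum_i (nu_1i - nu_2i) x_i
      beta_W  = sum_i (nu_3i - nu_1i) x_i
    xs is a list of training inputs, each a list of d floats."""
    d = len(xs[0])
    m_w = [sum(l1 * x[j] for l1, x in zip(lam1, xs)) for j in range(d)]
    a_w = [sum((l1 - l2) * x[j] for l1, l2, x in zip(lam1, lam2, xs)) for j in range(d)]
    b_w = [sum((l3 - l1) * x[j] for l3, l1, x in zip(lam3, lam1, xs)) for j in range(d)]
    return m_w, a_w, b_w

def fuzzy_norm_sq(m_w, a_w, b_w):
    """||W||^2 = ||m_W||^2 + ||m_W - alpha_W||^2 + ||m_W + beta_W||^2."""
    def sq(v):
        return sum(t * t for t in v)
    lower = [m - a for m, a in zip(m_w, a_w)]
    upper = [m + b for m, b in zip(m_w, b_w)]
    return sq(m_w) + sq(lower) + sq(upper)
```

The norm penalizes the modal coefficient vector together with the lower and upper boundary vectors of the fuzzy coefficients, mirroring the three endpoint terms of the metric d.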

Let us define α⁺_Wk = max{α_Wk, 0} and β⁺_Wk = max{β_Wk, 0} for k = 1, …, d, and the two vectors α⁺_W = (α⁺_W1, …, α⁺_Wd) and β⁺_W = (β⁺_W1, …, β⁺_Wd). Then we have

f(x) = B + ⟨W, x⟩ = B + (⟨m_W, x⟩, ⟨α⁺_W, x⟩, ⟨β⁺_W, x⟩),

where W = (m_W, α⁺_W, β⁺_W). If α_Wi ≥ 0 and β_Wi ≥ 0 for i = 1, 2, …, d, then we have

f(x) = B + ( Σ_{i=1}^{l} (λ_1i − λ*_1i)⟨x_i, x⟩,  Σ_{i=1}^{l} [(λ_1i − λ*_1i) − (λ_2i − λ*_2i)]⟨x_i, x⟩,  Σ_{i=1}^{l} [(λ_3i − λ*_3i) − (λ_1i − λ*_1i)]⟨x_i, x⟩ ).

Now we need to find m_B, α_B and β_B. By the KKT conditions,

λ_1i (ε + ξ_1i − m_Yi + ⟨m_W, x_i⟩ + m_B) = 0,
λ*_1i (ε + ξ*_1i + m_Yi − ⟨m_W, x_i⟩ − m_B) = 0,
(C − λ_1i) ξ_1i = 0,   (C − λ*_1i) ξ*_1i = 0,

and hence

m_B = m_Yi − ⟨m_W, x_i⟩ − ε   for λ_1i ∈ (0, C),
m_B = m_Yi − ⟨m_W, x_i⟩ + ε   for λ*_1i ∈ (0, C).

To find α_B and β_B we need to solve the optimization problem

minimize_{α_B, β_B ≥ 0}   Σ_{i=1}^{l} |m_Yi − α_Yi − ⟨m_W − α_W, x_i⟩ − m_B + α_B|_ε + Σ_{i=1}^{l} |m_Yi + β_Yi − ⟨m_W + β_W, x_i⟩ − m_B − β_B|_ε.
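Because this objective separates in α_B and β_B and each term is piecewise linear, the minimum over each spread on the half-line s ≥ 0 is attained at s = 0 or at a breakpoint where some term enters or leaves its ε-tube. A small candidate-search sketch, with hypothetical helper names:

```python
def eps_insensitive(delta, eps):
    """Vapnik's epsilon-insensitive loss."""
    return max(abs(delta) - eps, 0.0)

def best_spread(residuals, eps):
    """Minimize sum_i |r_i + s|_eps over s >= 0 by checking breakpoints.
    The objective is convex and piecewise linear in s, so the minimum is
    attained at s = 0 or at a kink s = -r_i - eps or s = -r_i + eps."""
    candidates = {0.0}
    for r in residuals:
        for c in (-r - eps, -r + eps):
            if c >= 0.0:
                candidates.add(c)

    def obj(s):
        return sum(eps_insensitive(r + s, eps) for r in residuals)

    return min(candidates, key=obj)
```

For α_B one would pass r_i = m_Yi − α_Yi − ⟨m_W − α_W, x_i⟩ − m_B, and similarly (with flipped sign on the spread) for β_B.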

Example 2. Following Gunn [11], data were constructed using the original x_i, y_i and the symmetric fuzzified Y_i shown in Table 3. Using this data set, the results obtained are shown in Table 4. In this example ε = 0.001 and C = 500 are used; they were determined heuristically. In fact, values of ε near 0 and C ≥ 500 give almost the same results. As seen from Table 4, this model appears inappropriate for this particular data set, because the residual sum is too large.

Table 3. Real input–fuzzy output data

x       Y = (m_Y, α_Y)
1       (−1.6, 0.5)
3       (−1.8, 0.5)
4       (−1.0, 0.5)
5.6     (1.2, 0.5)
7.8     (2.2, 1.0)
10.2    (6.8, 1.0)
11.0    (10.0, 1.0)
11.5    (10.0, 1.0)
12.7    (10.0, 1.0)

Table 4. Results for model M3

Model    ε        C      Coefficients                                      Residual sum
M3       0.001    500    B = (−5.451, 0, 0), W = (1.217, 0.508, 0.508)     269.514

3. SV fuzzy nonlinear regression

There have been only a few articles on fuzzy nonlinear regression. Researchers in fuzzy nonlinear regression have mostly been concerned with data consisting of crisp inputs and fuzzy output; some papers, for example [2,3,5], treat data sets with fuzzy inputs and fuzzy output. However, those fuzzy nonlinear regression methods look somewhat unrealistic, and they treat the estimation procedures only of particular models such as linear, polynomial, exponential and logarithmic ones. In this paper, we treat fuzzy nonlinear regression for data with numerical inputs and fuzzy output, without assuming an underlying model function. In other words, we extend the linear model M3 to the nonlinear case (Fig. 1). To do this, we use the idea of the SVM for crisp nonlinear regression [11,13,16–18]. The basic idea is that a nonlinear regression function is achieved by simply preprocessing the input patterns x_i by a map Φ: R^d → F into some feature space F and then applying the standard ridge regression learning algorithm. Notice that the only way in which the data appear in the algorithm for model M3 is in the form of inner products ⟨x_i, x_j⟩. The algorithm would therefore depend on the data only through inner products in F, i.e. on functions of the form ⟨Φ(x_i), Φ(x_j)⟩. Hence it suffices to know and use K(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩ instead of Φ(·) explicitly. Commonly used kernels for regression problems are given below:

K(x, y) = (⟨x, y⟩ + 1)^p   (polynomial kernel),
K(x, y) = exp(−‖x − y‖² / (2σ²))   (Gaussian kernel),
K(x, y) = tanh(κ⟨x, y⟩ + θ)   (hyperbolic tangent kernel).
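These three kernels translate directly into code. The parameter defaults below (p, σ, κ, θ) are illustrative choices, not values from the paper:

```python
import math

def poly_kernel(x, y, p=2):
    """Polynomial kernel (<x, y> + 1)^p."""
    return (sum(a * b for a, b in zip(x, y)) + 1.0) ** p

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def tanh_kernel(x, y, kappa=1.0, theta=0.0):
    """Hyperbolic tangent kernel tanh(kappa <x, y> + theta); note it is a
    valid Mercer kernel only for some parameter choices."""
    return math.tanh(kappa * sum(a * b for a, b in zip(x, y)) + theta)
```

Each function takes two inputs as equal-length sequences of floats and returns the scalar kernel value.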

Fig. 1. The SVM fuzzy regression model M2 with α_X = 0.75.

Hence, by replacing ⟨x, y⟩ with K(x, y) we obtain the following dual optimization problem:

maximize   −(1/2) Σ_{k=1}^{3} Σ_{i,j=1}^{l} (λ_ki − λ*_ki)(λ_kj − λ*_kj) K(x_i, x_j) − ε Σ_{k=1}^{3} Σ_{i=1}^{l} (λ_ki + λ*_ki)
           + Σ_{i=1}^{l} m_Yi (λ_1i − λ*_1i) + Σ_{i=1}^{l} (m_Yi − α_Yi)(λ_2i − λ*_2i) + Σ_{i=1}^{l} (m_Yi + β_Yi)(λ_3i − λ*_3i).

Here we should notice that the constraints are unchanged:

Σ_{i=1}^{l} (λ_ki − λ*_ki) = 0,  λ_ki, λ*_ki ∈ [0, C],  k = 1, 2, 3.

If Σ_{i=1}^{l} [(λ_1i − λ*_1i) − (λ_2i − λ*_2i)] K(x_i, x_j) ≥ 0 and Σ_{i=1}^{l} [(λ_3i − λ*_3i) − (λ_1i − λ*_1i)] K(x_i, x_j) ≥ 0 for j = 1, …, l, then we have

f(x) = B + ( Σ_{i=1}^{l} (λ_1i − λ*_1i) K(x_i, x),  Σ_{i=1}^{l} [(λ_1i − λ*_1i) − (λ_2i − λ*_2i)] K(x_i, x),  Σ_{i=1}^{l} [(λ_3i − λ*_3i) − (λ_1i − λ*_1i)] K(x_i, x) ).
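Given fitted net multipliers ν_ki = λ_ki − λ*_ki, evaluating the triangular output of the nonlinear model can be sketched as below. The data layout (a dict of multiplier lists keyed by k = 1, 2, 3) and the function names are our assumptions for illustration:

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2.0 * sigma ** 2))

def predict_fuzzy(x, xs, lam, B, kernel=gaussian_kernel):
    """Evaluate the nonlinear model output as a triangular fuzzy number
    (modal value, left spread, right spread).

    xs     : list of training inputs x_i
    lam[k] : list of net multipliers lambda_ki - lambda*_ki, k = 1, 2, 3
    B      : bias term (m_B, alpha_B, beta_B)
    Assumes the nonnegativity conditions on the spread expansions hold."""
    k_vals = [kernel(xi, x) for xi in xs]
    s1 = sum(l * k for l, k in zip(lam[1], k_vals))
    s2 = sum(l * k for l, k in zip(lam[2], k_vals))
    s3 = sum(l * k for l, k in zip(lam[3], k_vals))
    m_b, a_b, b_b = B
    # f(x) = B + (sum nu_1 K, sum (nu_1 - nu_2) K, sum (nu_3 - nu_1) K)
    return (m_b + s1, a_b + (s1 - s2), b_b + (s3 - s1))
```

With all multipliers zero the prediction reduces to the bias term B alone, which is a convenient sanity check.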

Now we need to find m_B, α_B and β_B. By the KKT conditions, we can compute m_B as follows:

m_B = m_Yi − Σ_{j=1}^{l} (λ_1j − λ*_1j) K(x_j, x_i) − ε   for λ_1i ∈ (0, C),
m_B = m_Yi − Σ_{j=1}^{l} (λ_1j − λ*_1j) K(x_j, x_i) + ε   for λ*_1i ∈ (0, C).

To find α_B and β_B we need to solve the optimization problem

minimize_{α_B, β_B ≥ 0}   Σ_{i=1}^{l} |m_Yi − α_Yi − Σ_{j=1}^{l} (λ_2j − λ*_2j) K(x_j, x_i) − m_B + α_B|_ε + Σ_{i=1}^{l} |m_Yi + β_Yi − Σ_{j=1}^{l} (λ_3j − λ*_3j) K(x_j, x_i) − m_B − β_B|_ε.

Example 3. We now apply the fuzzy nonlinear regression model to the data in Table 3. According to Gunn [11], a nonlinear regression model is appropriate for the original crisp data of Table 3. When we apply the fuzzy nonlinear regression model to this data set, we obtain the residual sum 0.833 and bias term B = (3.028, 0.875, 0.515). Hence the fuzzy nonlinear model is more appropriate for these data than the linear model M3. For this data set we use the Gaussian kernel with σ = 1.0 and C = 500; these parameters are determined heuristically (Fig. 2).

Fig. 2. The SV fuzzy nonlinear regression model.

4. Conclusion

In this paper, we have presented an SVM strategy for fuzzy multivariate linear and nonlinear regression. The experimental results show that the SVM algorithm for fuzzy regression models yields satisfactory solutions and is an attractive approach to modeling fuzzy data. The algorithm combines generalization control with a technique for addressing the curse of dimensionality, so that the SVM solution does not depend directly on the dimensionality of the input space. The main formulation results in a global quadratic optimization problem with box constraints, which is not computationally expensive to solve. The examples considered here are quite simple; the SVM is probably of greatest use when the dimensionality of the input space is high, so real-life examples with several inputs still need to be conducted. Some papers treat fuzzy nonlinear regression models but usually assume the underlying model function, even for data with numerical inputs and fuzzy output. The algorithm proposed here is model-free in the sense that we do not have to assume


the underlying model function. This model-free method has turned out to be a promising approach to fuzzy nonlinear regression with numerical inputs and fuzzy output. The main difference between our SVM approach and the nonlinear approaches of Buckley et al. [2,3] and Celmins [5] is not crisp input–fuzzy output versus fuzzy input–fuzzy output, but model-free versus model-dependent. Here the kernel parameter and the control parameter C are determined heuristically, and the obvious question that arises is which values are best for a particular problem; hence a model selection method for determining these parameters is needed.

Acknowledgements

This work was supported by Grant No. R01-2000-000-00011-0 from the Korea Science & Engineering Foundation. The authors wish to thank two referees for their valuable and constructive comments on an earlier version of this article.

References

[1] B.E. Boser, I.M. Guyon, V. Vapnik, A training algorithm for optimal margin classifiers, in: Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, ACM, New York, 1992.
[2] J. Buckley, T. Feuring, Linear and non-linear fuzzy regression: evolutionary algorithm solutions, Fuzzy Sets and Systems 112 (2000) 381–394.
[3] J. Buckley, T. Feuring, Y. Hayashi, Multivariate non-linear fuzzy regression: an evolutionary algorithm approach, Internat. J. Uncertainty, Fuzziness Knowledge-Based Systems 7 (1999) 83–98.
[4] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Knowledge Discovery and Data Mining, 1998.
[5] A. Celmins, A practical approach to nonlinear fuzzy regression, SIAM J. Sci. Statist. Comput. 12 (3) (1991) 521–546.
[6] C. Cortes, V. Vapnik, Support vector networks, Mach. Learning 20 (1995) 273–297.
[7] P. Diamond, Least squares fitting of several fuzzy variables, in: J.C. Bezdek (Ed.), Analysis of Fuzzy Information, CRC Press, Tokyo, 1987, pp. 329–331.
[8] P. Diamond, Fuzzy least squares, Inform. Sci. 46 (1988) 141–157.
[9] H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector regression machines, in: M. Mozer, M. Jordan, T. Petsche (Eds.), Advances in Neural Information Processing Systems, vol. 9, MIT Press, Cambridge, MA, 1997, pp. 155–161.
[10] D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.
[11] S. Gunn, Support Vector Machines for Classification and Regression, ISIS Tech. Report, University of Southampton, 1998.
[12] J. Kacprzyk, M. Fedrizzi, Fuzzy Regression Analysis, Physica-Verlag, Heidelberg, 1992.
[13] A.J. Smola, B. Scholkopf, A tutorial on support vector regression, NeuroCOLT2 Tech. Report, NeuroCOLT, 1998.
[14] H. Tanaka, S. Uejima, K. Asai, Fuzzy linear regression model, in: Internat. Congress on Applied Systems Research and Cybernetics, Acapulco, Mexico, 1980.
[15] H. Tanaka, S. Uejima, K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Systems Man Cybernet. 12 (6) (1982) 903–907.
[16] V. Vapnik, The Nature of Statistical Learning Theory, Springer, Berlin, 1995.
[17] V. Vapnik, Statistical Learning Theory, Springer, Berlin, 1998.
[18] V. Vapnik, S. Golowich, A. Smola, Support vector method for function approximation, regression estimation, and signal processing, Adv. Neural Inform. Process. Systems 9 (1996) 281–287.

U

11

pp: 1--11 (col.fig.: Nil)

PROD. TYPE: COM

ED: DK PAGN: Suresh -- SCAN: VKumar

ARTICLE IN PRESS

Fuzzy Sets and Systems

(

)

–

www.elsevier.com/locate/fss

1

Support vector fuzzy regression machines Dug Hun Honga;∗ , Changha Hwangb a

F

3

School of Mechanical and Automotive Engineering, Catholic University of Daegu, 330 Keumrak 1-ri Hayang-up Kyongsan, Kyungbuk 712 - 702, South Korea b Department of Statistical Information, Catholic University of Daegu, Kyungbuk 712 - 702, South Korea

7

Received 3 August 2001; received in revised form 9 October 2001; accepted 28 October 2002

O

O

5

13

D

11

Support vector machine (SVM) has been very successful in pattern recognition and function estimation problems. In this paper, we introduce the use of SVM for multivariate fuzzy linear and nonlinear regression models. Using the basic idea underlying SVM for multivariate fuzzy regressions gives computational e2ciency of getting solutions. c 2002 Published by Elsevier Science B.V.

TE

9

PR

Abstract

23 25 27

R

R

O

21

C

19

Linear regression models are widely used today in business, administration, economics, engineering, as well as in many other traditionally nonquantitative 9elds such as social, health, and biological sciences. In all cases of fuzzy regression, the linear regression is recommended for practical situations when decisions often have to be made on the basis of imprecise and/or partially available data. Many di:erent fuzzy regression approaches have been proposed. Fuzzy regression, as 9rst developed by Tanaka et al. [14] in a linear system, is based on the extension principle. Tanaka et al. [14] initially applied their fuzzy linear regression procedure to nonfuzzy experimental data. In the experiments that followed this pioneering e:ort, Tanaka et al. [15] used fuzzy input experimental data to build fuzzy regression models. Fuzzy input data used in these experiments were given in the form of triangular fuzzy numbers. The process is explained in more detail by Dubois and Prade [10]. A technique for linear least-squares 9tting of fuzzy variable was developed by Diamond [7,8] giving the solution to an analog of the normal equation of classical least squares. A collection of recent papers dealing with several approaches to fuzzy regression analysis can be found [12].

N

17

1. Introduction

U

15

EC

Keywords: Fuzzy inference systems; Support vector machine; Fuzzy regression; Fuzzy system models

∗

Corresponding author. Tel.: +82-53-850-2712; fax: +82-53-850-2799. E-mail address: [email protected] (D.H. Hong).

c 2002 Published by Elsevier Science B.V. 0165-0114/02/$ - see front matter PII: S 0 1 6 5 - 0 1 1 4 ( 0 2 ) 0 0 5 1 4 - 6

FSS 4046

ARTICLE IN PRESS 2

21 23

25 27 29 31 33 35 37

F

O

O

PR

19

D

17

TE

15

2. SVM fuzzy linear regression

EC

13

R

11

In this section, we will modify the underlying idea of SVM for the purpose of deriving the convex optimization problems for multivariate fuzzy linear regression models. In Section 3 we will consider multivariate fuzzy nonlinear regression model for numerical inputs and fuzzy output. The basic idea of SVM gives computational e2ciency in 9nding solutions of fuzzy regression models particularly for multivariate case. Here, we consider three di:erent models for multivariate fuzzy linear regression. To do this we need some preliminaries. Let X = (m; ; ) be a triangular fuzzy numbers when m is the model value of X and and are the left and right spreads respectively. If = , we can write X = (m; ). On the space T (R) of all triangular fuzzy numbers we use the metric d de9ned by

R

9

In contrast to fuzzy linear regression, there have been only a few articles on fuzzy nonlinear regression. What researchers in fuzzy nonlinear regression were concerned with was data of the form with crisp inputs and fuzzy output. However, some papers, for example [2,3,5], were concerned with the data set with fuzzy inputs and fuzzy output. By the way, in this paper we will treat fuzzy nonlinear regression for data of the form with crisp inputs and fuzzy output. In this paper, we discuss multivariate fuzzy linear and nonlinear regression by support vector machine (SVM). SVM has been recently introduced for solving pattern recognition and function estimation problems [1,4,6,9,11,13,16–18]. SVM is a nonlinear generalization of the Generalized Portrait algorithm developed in Russia in the 1960s. In its present form, the SVM was developed at AT&T Bell Laboratories by Vapnik and co-workers. Due to this industrial context, SVM research has up to date had a sound orientation towards real-world applications. SVM learning has now evolved into an active area of research. Moreover, it is in the process of entering the standard methods toolbox of machine learning. SVM is based on the idea of structural risk minimization, which shows that the generalization error is bounded by the sum of the training set and a term depending on the Vapnik–Chervonenkis dimension. By minimizing this bound, high generalization performance can be achieved. Moreover, unlike other machine learning methods, SVMs generalization error is not related to the problem’s input dimensionality. This explains why SVM can have good performance even in high dimensional problem. The main di:erence between our SVM approach and the nonlinear approaches by Buckley et al. [2,3] and Celmins [5] is not crisp input–fuzzy output versus fuzzy input–fuzzy output, but model-free versus model-dependent. The rest of this paper is organized as follows. 
Section 2 illustrates the SVM regression procedures for fuzzy multivariate linear models. Section 3 describes how to apply this idea to the fuzzy multivariate nonlinear model with numerical inputs and fuzzy output. Section 4 gives some conclusions.

O

7

–

C

5

)

N

3

(

U

1

D.H. Hong, C. Hwang / Fuzzy Sets and Systems

d(X; Y )2 = (mX − mY )2 + ((mX − X ) − (mY − Y ))2 + ((mX + X ) − (mY + Y ))2 ;

where X = (mX ; X ; X ) and Y = (mY ; Y ; Y ) are any two vectors of triangular fuzzy numbers in T (R). A linear structure is de9ned on T (R) by (mX ; X ; X ) + (mY ; Y ; Y ) = (mX + mY ; X + Y ; X + Y ); t(m; ; ) = (tm; t; t) if t¿0, and t(m; ; ) = (tm; t; t) if t¡0.

FSS 4046

ARTICLE IN PRESS D.H. Hong, C. Hwang / Fuzzy Sets and Systems

9

2.1. Case of fuzzy inputs and fuzzy output Suppose that observations consist of data pairs (X i ; Yi ); i = 1; : : : ; l, where X i = ((mXi1 ; Xi1 ; Xi1 ); : : : ; (mXid ; Xid ; Xid ))∈T (R) d and Yi = (mYi ; Yi ; Yi )∈T (R). Here T (R) and T (R)d are the set of triangular fuzzy numbers and the set of d-vectors of triangular fuzzy numbers, respectively. Let mX i = (mXi1 ; : : : ; mXid ), Xi = (Xi1 ; : : : ; Xid ) and Xi = (Xi1 ; : : : ; Xid ). Now, we consider the following two models M 1 and M 2: M 1 : f(X) = B + w; X;

B ∈ T (R);

M 2 : f(X) = b + w; X;

b ∈ R;

w ∈ Rd ;

w ∈ Rd :

F

7

3

For computational simplicity we assume X i ; Yi and B are symmetric triangular fuzzy numbers. Then model M 1 can be rewritten as

O

5

–

O

3

)

M 1 : f(X) = (w; mX + mB ; |w|; X + B );

PR

1

(

where |w| = (|w1 |; |w2 |; : : : ; |wd |). Now, we are going to consider how to get solutions for two models M 1 and M 2. In fact, the model M 2 is straightforward from the model M 1. Model M 1: We arrive at the following convex optimization problem for the model M 1 by modifying the basic idea of SVM for crisp linear regression [11,13,16–18].

D

13

minimize

TE

11

2 l 1 2 w + C (ki + ∗ki ) 2 i=1

EC

k=1

R

mYi − w; mXi − mB 6 + 1i ; w; mXi + mB − mYi 6 + ∗1i ; (mYi − Yi ) − (w; mXi + mB − |w|; Xi − B ) 6 + 2i ; (w; mXi + mB − |w|; Xi − B ) − (mYi − Yi ) 6 + ∗2i ; ki ; ∗ki ¿ 0; k = 1; 2:

(1)

O

R

subject to

C

N

U

15

Then, constructing a Lagrange function and di:erentiating it with regard to mB ; B ; w; ki and ∗ki ; k = 1; 2, we can derive the corresponding dual optimization problem for model M 1 as follows:

maximize

subject to

2 l 2 1 (ki + ki∗ ) − 2 w − k=1 i=1 l i=1

+

l

i=1

∗ mYi (1i − 1i )+

l

i=1

∗ (mYi − Yi )(2i − 2i )

(ki − ki∗ ) = 0; ki ; ki∗ ∈ [0; C]; k = 1; 2;

FSS 4046

ARTICLE IN PRESS 4

1

D.H. Hong, C. Hwang / Fuzzy Sets and Systems

l

∗ (1i − 1i )mXi +

i=1

l

∗ (2i − 2i )(mXi − |Xi |):

i=1

Solving the above equation under constraints determines the Lagrange multipliers ki ; ki∗ , and the optimal regression function is given by f(X) =

l

∗

(1i − 1i )mXi ; X +

l

(2)

i=1

F

i=1

∗ (2i − 2i )mXi − |Xi |; X + B:

Now we need to 9nd mB and B . By Karush–Kuhn–Tucker (KKT) conditions, we can compute mB as follows: ( + 1i − mYi + w; mXi + mB ) = 0; 1i ∗ ( + ∗1i + mYi − w; mXi − mB ) = 0; 1i (C − ) = 0; (C − ∗ )∗ = 0 1i

and hence mB = mYi − w; mXi −

for 1i ∈ (0; C);

mB = mYi − w; mXi +

∗ for 1i ∈ (0; C):

TE

11

1i

To 9nd B we need to solve the optimization problem given below minimize B ¿ 0

l

{|mYi − Yi − (w; mXi + mB − w; Xi − B )| };

EC

9

1i

D

1i

PR

O

7

–

O

5

)

where w=

3

(

i=1

where -insensitive loss function || is de9ned by 0 if || 6 ; || = || − otherwise:

15

Model M2: Since the solution for model M2 follows directly from that of model M1, the solutions are easily obtained as follows. The objective function to be maximized is the same as the one for model M1, but the constraints are slightly different. We can compute w, m_B and α_B in the same way as for model M1:

maximize
\[
-\frac{1}{2}\|w\|^{2}-\varepsilon\sum_{k=1}^{2}\sum_{i=1}^{l}(\lambda_{ki}+\lambda_{ki}^{*})+\sum_{i=1}^{l}m_{Y_i}(\lambda_{1i}-\lambda_{1i}^{*})+\sum_{i=1}^{l}(m_{Y_i}-\alpha_{Y_i})(\lambda_{2i}-\lambda_{2i}^{*})
\]
subject to
\[
\sum_{i=1}^{l}(\lambda_{ki}-\lambda_{ki}^{*})=0\quad\text{and}\quad\lambda_{ki},\lambda_{ki}^{*}\in[0,C],\ k=1,2.
\]


For this model b is exactly the same as m_B for model M1, and hence
\[
m_B=m_{Y_i}-\langle w,m_{X_i}\rangle-\varepsilon\quad\text{for }\lambda_{1i}\in(0,C),\qquad
m_B=m_{Y_i}-\langle w,m_{X_i}\rangle+\varepsilon\quad\text{for }\lambda_{1i}^{*}\in(0,C).
\]

Example 1. From Gunn [11], the data were constructed using the original x_i, y_i and symmetric fuzzified X_i, Y_i as shown in Table 1. Using the data in Table 1, the results obtained are shown in Table 2. In this example ε and C are determined heuristically. In fact, the values C ≥ 500 give almost the same results; however, the results depend strongly on the value of ε.

Table 1
Fuzzy input–fuzzy output data

X = (m_X, α_X)   Y = (m_Y, α_Y)
(1, 0.5)         (−1.6, 0.5)
(3, 0.5)         (−1.8, 0.5)
(4, 0.5)         (−1.0, 0.5)
(5.6, 0.8)       (1.2, 0.5)
(7.8, 0.8)       (2.2, 1.0)
(10.2, 0.8)      (6.8, 1.0)
(11.0, 1.0)      (10.0, 1.0)
(11.5, 1.0)      (10.0, 1.0)
(12.7, 1.0)      (10.0, 1.0)

Table 2

Model   ε      C     Coefficient                          Residual sum
M1      0      500   B = (−2.457, 0.071), w = 0.857       95.314
M2      0.01   500   b = −5.507, w = 1.239                50.338

2.2. Case of numerical inputs and fuzzy output

Suppose that data pairs (x_i, Y_i), i = 1, 2, ..., l, are observed, where each x_i is a d-vector of real numbers and each Y_i ∈ T(R). Let x_ij be the jth element of x_i. Then we assume x_ij ≥ 0 by a simple translation of all vectors. Let W = (W_1, W_2, ..., W_d), where W_i = (m_{W_i}, α_{W_i}, β_{W_i}), α_{W_i}, β_{W_i} ≥ 0, i = 1, ..., d, and let B = (m_B, α_B, β_B), α_B, β_B ≥ 0. We now consider the following model:

M3: f(x) = B + ⟨W, x⟩,  B ∈ T(R), W ∈ T(R)^d, x ∈ R^d
         = B + W_1 x_1 + W_2 x_2 + ⋯ + W_d x_d,

where T(R)^d is the set of d-vectors of triangular fuzzy numbers. We define ‖W‖² = ‖m_W‖² + ‖m_W − α_W‖² + ‖m_W + β_W‖², where m_W = (m_{W_1}, ..., m_{W_d}), α_W = (α_{W_1}, ..., α_{W_d}) and β_W = (β_{W_1}, ..., β_{W_d}). Then we arrive at the following convex optimization


problem for model M3:

minimize
\[
\frac{1}{2}\|W\|^{2}+C\sum_{k=1}^{3}\sum_{i=1}^{l}(\xi_{ki}+\xi_{ki}^{*})
\]
subject to
\[
\begin{cases}
m_{Y_i}-(\langle m_W,x_i\rangle+m_B)\le\varepsilon+\xi_{1i},\\
(\langle m_W,x_i\rangle+m_B)-m_{Y_i}\le\varepsilon+\xi_{1i}^{*},\\
(m_{Y_i}-\alpha_{Y_i})-(\langle m_W,x_i\rangle+m_B-\langle\alpha_W,x_i\rangle-\alpha_B)\le\varepsilon+\xi_{2i},\\
(\langle m_W,x_i\rangle+m_B-\langle\alpha_W,x_i\rangle-\alpha_B)-(m_{Y_i}-\alpha_{Y_i})\le\varepsilon+\xi_{2i}^{*},\\
(m_{Y_i}+\alpha_{Y_i})-(\langle m_W,x_i\rangle+m_B+\langle\beta_W,x_i\rangle+\beta_B)\le\varepsilon+\xi_{3i},\\
(\langle m_W,x_i\rangle+m_B+\langle\beta_W,x_i\rangle+\beta_B)-(m_{Y_i}+\alpha_{Y_i})\le\varepsilon+\xi_{3i}^{*},\\
\xi_{ki},\xi_{ki}^{*}\ge0,\quad k=1,2,3.
\end{cases}
\]

Then, constructing a Lagrange function and differentiating it with respect to m_B, α_B, β_B, m_W, α_W, β_W, ξ_ki and ξ*_ki, k = 1, 2, 3, we can get
\[
m_W=\sum_{i=1}^{l}(\lambda_{1i}-\lambda_{1i}^{*})x_i,\qquad
\alpha_W=\sum_{i=1}^{l}[(\lambda_{1i}-\lambda_{1i}^{*})-(\lambda_{2i}-\lambda_{2i}^{*})]x_i,\qquad
\beta_W=\sum_{i=1}^{l}[(\lambda_{3i}-\lambda_{3i}^{*})-(\lambda_{1i}-\lambda_{1i}^{*})]x_i,
\]
and in effect can derive the corresponding dual optimization problem for model M3 as follows:

maximize
\[
-\frac{1}{2}\sum_{k=1}^{3}\sum_{i,j=1}^{l}(\lambda_{ki}-\lambda_{ki}^{*})(\lambda_{kj}-\lambda_{kj}^{*})\langle x_i,x_j\rangle
-\varepsilon\sum_{k=1}^{3}\sum_{i=1}^{l}(\lambda_{ki}+\lambda_{ki}^{*})
+\sum_{i=1}^{l}m_{Y_i}(\lambda_{1i}-\lambda_{1i}^{*})
+\sum_{i=1}^{l}(m_{Y_i}-\alpha_{Y_i})(\lambda_{2i}-\lambda_{2i}^{*})
+\sum_{i=1}^{l}(m_{Y_i}+\alpha_{Y_i})(\lambda_{3i}-\lambda_{3i}^{*})
\]
subject to
\[
\sum_{i=1}^{l}(\lambda_{ki}-\lambda_{ki}^{*})=0,\quad k=1,2,3,\qquad
\lambda_{ki},\lambda_{ki}^{*}\in[0,C],\quad k=1,2,3,\ i=1,\dots,l.
\]
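To make the dual for model M3 concrete, it can be handed to any box-constrained QP solver. The sketch below is ours, not the authors' implementation: the data set, ε and C are invented, and scipy's general-purpose SLSQP routine stands in for a dedicated QP solver.

```python
import numpy as np
from scipy.optimize import minimize

# Invented data set: crisp 1-D inputs, symmetric triangular fuzzy outputs
x = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])  # l = 5, d = 1
mY = np.array([1.0, 3.1, 4.9, 7.0, 9.1])           # output centers
aY = np.array([0.5, 0.5, 0.8, 0.8, 1.0])           # output spreads
l, eps, C = len(x), 0.1, 10.0

G = x @ x.T                  # Gram matrix of inner products <x_i, x_j>
c = [mY, mY - aY, mY + aY]   # linear coefficients for k = 1, 2, 3

def neg_dual(z):
    """Negative dual objective; z packs lambda_ki and lambda*_ki, k = 1, 2, 3."""
    z = z.reshape(3, 2, l)
    val = 0.0
    for k in range(3):
        eta = z[k, 0] - z[k, 1]  # lambda_k - lambda*_k
        val += -0.5 * eta @ G @ eta - eps * z[k].sum() + c[k] @ eta
    return -val

# One equality constraint sum_i (lambda_ki - lambda*_ki) = 0 per k
cons = [{"type": "eq",
         "fun": lambda z, k=k: (z.reshape(3, 2, l)[k, 0]
                                - z.reshape(3, 2, l)[k, 1]).sum()}
        for k in range(3)]
res = minimize(neg_dual, np.zeros(6 * l), method="SLSQP",
               bounds=[(0.0, C)] * (6 * l), constraints=cons,
               options={"maxiter": 500})

z = res.x.reshape(3, 2, l)
eta = z[:, 0] - z[:, 1]
m_W = eta[0] @ x                 # center weights
alpha_W = (eta[0] - eta[1]) @ x  # left-spread weights
beta_W = (eta[2] - eta[0]) @ x   # right-spread weights
print(res.success, m_W, alpha_W, beta_W)
```

SLSQP scales poorly beyond a few hundred multipliers; for larger l a dedicated QP solver exploiting the block structure of the quadratic form would be the natural choice.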


Let us define α̂_{W_k} = max{α_{W_k}, 0} and β̂_{W_k} = max{β_{W_k}, 0} for k = 1, ..., d. Now we define two vectors α̂_W = (α̂_{W_1}, ..., α̂_{W_d}) and β̂_W = (β̂_{W_1}, ..., β̂_{W_d}). Then we have
\[
f(x)=B+\langle\hat W,x\rangle=B+(\langle m_W,x\rangle,\langle\hat\alpha_W,x\rangle,\langle\hat\beta_W,x\rangle),
\]
where Ŵ = (m_W, α̂_W, β̂_W). If α_{W_i} ≥ 0 and β_{W_i} ≥ 0 for i = 1, 2, ..., d, then we have
\[
f(x)=B+\Big(\sum_{i=1}^{l}(\lambda_{1i}-\lambda_{1i}^{*})\langle x_i,x\rangle,\ \sum_{i=1}^{l}[(\lambda_{1i}-\lambda_{1i}^{*})-(\lambda_{2i}-\lambda_{2i}^{*})]\langle x_i,x\rangle,\ \sum_{i=1}^{l}[(\lambda_{3i}-\lambda_{3i}^{*})-(\lambda_{1i}-\lambda_{1i}^{*})]\langle x_i,x\rangle\Big).
\]

Now we need to find m_B, α_B and β_B. By the KKT conditions, we can compute m_B as follows:
\[
\lambda_{1i}(\varepsilon+\xi_{1i}-m_{Y_i}+\langle m_W,x_i\rangle+m_B)=0,\qquad
\lambda_{1i}^{*}(\varepsilon+\xi_{1i}^{*}+m_{Y_i}-\langle m_W,x_i\rangle-m_B)=0,
\]
\[
(C-\lambda_{1i})\xi_{1i}=0,\qquad(C-\lambda_{1i}^{*})\xi_{1i}^{*}=0,
\]
and hence
\[
m_B=m_{Y_i}-\langle m_W,x_i\rangle-\varepsilon\quad\text{for }\lambda_{1i}\in(0,C),\qquad
m_B=m_{Y_i}-\langle m_W,x_i\rangle+\varepsilon\quad\text{for }\lambda_{1i}^{*}\in(0,C).
\]
To find α_B, β_B we need to solve the optimization problem given below:
\[
\min_{\alpha_B,\beta_B\ge0}\ \sum_{i=1}^{l}\big|m_{Y_i}-\alpha_{Y_i}-\langle m_W-\alpha_W,x_i\rangle-m_B+\alpha_B\big|_{\varepsilon}
+\sum_{i=1}^{l}\big|m_{Y_i}+\alpha_{Y_i}-\langle m_W+\beta_W,x_i\rangle-m_B-\beta_B\big|_{\varepsilon}.
\]
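The spread-bias problem above is separable: the first sum depends only on α_B and the second only on β_B, so each reduces to a one-dimensional convex piecewise-linear problem. A sketch (data, function names and the upper search bound are ours) using scipy's bounded scalar minimizer:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def eps_loss(r, eps):
    """Elementwise eps-insensitive loss |r|_eps."""
    return np.maximum(np.abs(r) - eps, 0.0)

def fit_spread_biases(x, mY, aY, m_W, a_W, b_W, m_B, eps, hi=100.0):
    """Minimize each separable term over a_B >= 0 and b_B >= 0."""
    r_lo = mY - aY - x @ (m_W - a_W) - m_B  # lower-endpoint residuals
    r_hi = mY + aY - x @ (m_W + b_W) - m_B  # upper-endpoint residuals
    a_B = minimize_scalar(lambda t: eps_loss(r_lo + t, eps).sum(),
                          bounds=(0.0, hi), method="bounded").x
    b_B = minimize_scalar(lambda t: eps_loss(r_hi - t, eps).sum(),
                          bounds=(0.0, hi), method="bounded").x
    return a_B, b_B

# Invented data generated exactly from f(x) = 2x + 1 with unit output
# spreads and zero spread weights, so the optimum is a_B = b_B = 1
x = np.array([[1.0], [2.0], [3.0]])
mY = 2.0 * x[:, 0] + 1.0
aY = np.ones(3)
a_B, b_B = fit_spread_biases(x, mY, aY, np.array([2.0]), np.array([0.0]),
                             np.array([0.0]), m_B=1.0, eps=0.0)
print(a_B, b_B)
```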

Example 2. From Gunn [11], the data were constructed using the original x_i, y_i and symmetric fuzzified Y_i as shown in Table 3. Using this data set, the results obtained are shown in Table 4. In this example ε = 0.001 and C = 500 are used, determined heuristically. In fact, values of ε near 0 and C ≥ 500 give almost the same results. As seen from Table 4, this model is not appropriate for this particular data set because the residual sum is too large.

Table 3
Real input–fuzzy output data

x      Y = (m_Y, α_Y)
1      (−1.6, 0.5)
3      (−1.8, 0.5)
4      (−1.0, 0.5)
5.6    (1.2, 0.5)
7.8    (2.2, 1.0)
10.2   (6.8, 1.0)
11.0   (10.0, 1.0)
11.5   (10.0, 1.0)
12.7   (10.0, 1.0)

Table 4

Model   ε       C     Coefficient                                     Residual sum
M3      0.001   500   B = (−5.451, 0, 0), W = (1.217, 0.508, 0.508)   269.514

3. SV fuzzy nonlinear regression

There have been only a few articles on fuzzy nonlinear regression. Researchers in fuzzy nonlinear regression have mainly been concerned with data with crisp inputs and fuzzy output. Some papers, for example [2,3,5], considered data sets with fuzzy inputs and fuzzy output. However, those fuzzy nonlinear regression methods look somewhat unrealistic and treat only the estimation procedures of particular model forms such as linear, polynomial, exponential and logarithmic ones. In this paper, we treat fuzzy nonlinear regression for data with numerical inputs and fuzzy output, without assuming an underlying model function. In other words, we extend the linear model M3 to the nonlinear case (Fig. 1). To do this, we use the idea of SVM for crisp nonlinear regression [11,13,16–18]. The basic idea is that a nonlinear regression function is obtained by simply preprocessing the input patterns x_i by a map Φ: R^d → F into some feature space F and then applying the standard ridge regression learning algorithm. Notice that the only way in which the data appear in the algorithm for model M3 is in the form of inner products ⟨x_i, x_j⟩. The algorithm therefore depends on the data only through inner products in F, i.e. on functions of the form ⟨Φ(x_i), Φ(x_j)⟩. Hence it suffices to know and use K(x_i, x_j) = ⟨Φ(x_i), Φ(x_j)⟩ instead of Φ(·) explicitly. The kernels commonly used for regression problems are given below:

\[
K(x,y)=(\langle x,y\rangle+1)^{p}:\ \text{polynomial kernel},\qquad
K(x,y)=\exp\!\Big(-\frac{\|x-y\|^{2}}{2\sigma^{2}}\Big):\ \text{Gaussian kernel},\qquad
K(x,y)=\tanh(\kappa\langle x,y\rangle+\theta):\ \text{hyperbolic tangent kernel}.
\]
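These kernels translate directly into code; a minimal sketch (function and parameter names ours). Note that the hyperbolic tangent kernel is not positive semi-definite for all parameter choices.

```python
import numpy as np

def poly_kernel(x, y, p=2):
    """K(x, y) = (<x, y> + 1)^p"""
    return (np.dot(x, y) + 1.0) ** p

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, y) = exp(-||x - y||^2 / (2 sigma^2))"""
    d = np.asarray(x) - np.asarray(y)
    return np.exp(-np.dot(d, d) / (2.0 * sigma ** 2))

def tanh_kernel(x, y, kappa=1.0, theta=0.0):
    """K(x, y) = tanh(kappa <x, y> + theta)"""
    return np.tanh(kappa * np.dot(x, y) + theta)

x, y = np.array([1.0, 1.0]), np.array([1.0, 1.0])
print(poly_kernel(x, y, p=2))  # (<x, y> + 1)^2 = (2 + 1)^2 = 9
print(gaussian_kernel(x, x))   # zero distance gives K(x, x) = 1
```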


Hence, by replacing ⟨x_i, x_j⟩ with K(x_i, x_j) we obtain the following dual optimization problem:

Fig. 1. The SVM fuzzy regression model M2 with α_X = 0.75.

maximize
\[
-\frac{1}{2}\sum_{k=1}^{3}\sum_{i,j=1}^{l}(\lambda_{ki}-\lambda_{ki}^{*})(\lambda_{kj}-\lambda_{kj}^{*})K(x_i,x_j)
-\varepsilon\sum_{k=1}^{3}\sum_{i=1}^{l}(\lambda_{ki}+\lambda_{ki}^{*})
+\sum_{i=1}^{l}m_{Y_i}(\lambda_{1i}-\lambda_{1i}^{*})
+\sum_{i=1}^{l}(m_{Y_i}-\alpha_{Y_i})(\lambda_{2i}-\lambda_{2i}^{*})
+\sum_{i=1}^{l}(m_{Y_i}+\alpha_{Y_i})(\lambda_{3i}-\lambda_{3i}^{*})
\]
subject to
\[
\sum_{i=1}^{l}(\lambda_{ki}-\lambda_{ki}^{*})=0,\qquad\lambda_{ki},\lambda_{ki}^{*}\in[0,C],\quad k=1,2,3.
\]
Here we should notice that the constraints are unchanged. If
\[
\sum_{i=1}^{l}[(\lambda_{1i}-\lambda_{1i}^{*})-(\lambda_{2i}-\lambda_{2i}^{*})]K(x_i,x_j)\ge0
\quad\text{and}\quad
\sum_{i=1}^{l}[(\lambda_{3i}-\lambda_{3i}^{*})-(\lambda_{1i}-\lambda_{1i}^{*})]K(x_i,x_j)\ge0,
\quad j=1,\dots,l,
\]
then we have
\[
f(x)=B+\Big(\sum_{i=1}^{l}(\lambda_{1i}-\lambda_{1i}^{*})K(x_i,x),\ \sum_{i=1}^{l}[(\lambda_{1i}-\lambda_{1i}^{*})-(\lambda_{2i}-\lambda_{2i}^{*})]K(x_i,x),\ \sum_{i=1}^{l}[(\lambda_{3i}-\lambda_{3i}^{*})-(\lambda_{1i}-\lambda_{1i}^{*})]K(x_i,x)\Big).
\]
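Given the dual multipliers, the triple above can be evaluated directly. The sketch below is ours (the function name and argument packing are assumptions); it adds the bias B componentwise, relying on the nonnegativity condition just stated for the spread components.

```python
import numpy as np

def predict_fuzzy(x_new, X, lam, lam_star, B, kernel):
    """Evaluate f(x) as a triangular triple (center, left spread, right spread).

    lam, lam_star: (3, l) arrays of the multipliers lambda_ki, lambda*_ki;
    B: triple (m_B, a_B, b_B); kernel: function of two vectors."""
    Kx = np.array([kernel(xi, x_new) for xi in X])  # K(x_i, x_new)
    eta = lam - lam_star
    m_B, a_B, b_B = B
    center = m_B + eta[0] @ Kx
    left = a_B + (eta[0] - eta[1]) @ Kx   # assumed nonnegative, per the text
    right = b_B + (eta[2] - eta[0]) @ Kx
    return center, left, right

# With all multipliers zero the prediction reduces to the bias B itself
X = np.array([[0.0], [1.0]])
zeros = np.zeros((3, 2))
triple = predict_fuzzy(np.array([0.5]), X, zeros, zeros,
                       B=(1.0, 0.2, 0.3),
                       kernel=lambda a, b: (np.dot(a, b) + 1.0) ** 2)
print(triple)
```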

Now we need to find m_B, α_B and β_B. By the KKT conditions, we can compute m_B as follows:
\[
m_B=m_{Y_i}-\sum_{j=1}^{l}(\lambda_{1j}-\lambda_{1j}^{*})K(x_j,x_i)-\varepsilon\quad\text{for }\lambda_{1i}\in(0,C),\qquad
m_B=m_{Y_i}-\sum_{j=1}^{l}(\lambda_{1j}-\lambda_{1j}^{*})K(x_j,x_i)+\varepsilon\quad\text{for }\lambda_{1i}^{*}\in(0,C).
\]


To find α_B, β_B we need to solve the optimization problem given below:
\[
\min_{\alpha_B,\beta_B\ge0}\ \sum_{i=1}^{l}\Big|m_{Y_i}-\alpha_{Y_i}-\sum_{j=1}^{l}(\lambda_{2j}-\lambda_{2j}^{*})K(x_j,x_i)-m_B+\alpha_B\Big|_{\varepsilon}
+\sum_{i=1}^{l}\Big|m_{Y_i}+\alpha_{Y_i}-\sum_{j=1}^{l}(\lambda_{3j}-\lambda_{3j}^{*})K(x_j,x_i)-m_B-\beta_B\Big|_{\varepsilon}.
\]

Fig. 2. The SV fuzzy nonlinear regression model.

Example 3. We now apply the fuzzy nonlinear regression model to the data in Table 3. According to Gunn [11], a nonlinear regression model is appropriate for the original crisp data of Table 3. When we apply the fuzzy nonlinear regression model to this data set, we obtain the residual sum 0.833 and bias term B = (3.028, 0.875, 0.515). Hence we can conclude that the fuzzy nonlinear model is more appropriate than the linear model M3. For this data set we use the Gaussian kernel with σ = 1.0 and C = 500; these parameters are determined heuristically (Fig. 2).

4. Conclusion

In this paper, we have presented an SVM strategy for multivariate fuzzy linear and nonlinear regression. The experimental results show that the SVM algorithm for fuzzy regression models yields satisfactory solutions and is an attractive approach to modeling fuzzy data. The algorithm combines generalization control with a technique to address the curse of dimensionality, so that the SVM solution does not depend directly on the dimensionality of the input space. The main formulation results in a global quadratic optimization problem with box constraints, which is not computationally expensive to solve. The examples we considered here are quite simple; SVM probably has its greatest use when the dimensionality of the input space is high, so real-life examples with several inputs still need to be conducted. There have been some papers treating fuzzy nonlinear regression models; they usually assume underlying model functions even for data with numerical inputs and fuzzy output. The algorithm proposed here is a model-free method in the sense that we do not have to assume


the underlying model function. This model-free method has turned out to be a promising way of treating fuzzy nonlinear regression models with numerical inputs and fuzzy output. The main difference between our SVM approach and the nonlinear approaches by Buckley et al. [2,3] and Celmins [5] is not crisp input–fuzzy output versus fuzzy input–fuzzy output, but model-free versus model-dependent. Here the kernel parameter and the control parameter C are determined in a heuristic way. The obvious question that arises is which values are best for a particular problem; hence a model selection method is needed to determine these parameters.

Acknowledgements


This work was supported by Grant No. R01-2000-000-00011-0 from the Korea Science & Engineering Foundation. The authors wish to thank the two referees for their valuable and constructive comments on an earlier version of this article.

References


[1] B.E. Boser, I.M. Guyon, V. Vapnik, A training algorithm for optimal margin classifiers, in: Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, ACM, New York, 1992.
[2] J. Buckley, T. Feuring, Linear and non-linear fuzzy regression: evolutionary algorithm solutions, Fuzzy Sets and Systems 112 (2000) 381–394.
[3] J. Buckley, T. Feuring, Y. Hayashi, Multivariate non-linear fuzzy regression: an evolutionary algorithm approach, Internat. J. Uncertainty, Fuzziness Knowledge-Based Systems 7 (1999) 83–98.
[4] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Data Mining and Knowledge Discovery 2 (1998) 121–167.
[5] A. Celmins, A practical approach to nonlinear fuzzy regression, SIAM J. Sci. Statist. Comput. 12 (3) (1991) 521–546.
[6] C. Cortes, V. Vapnik, Support vector networks, Mach. Learning 20 (1995) 273–297.
[7] P. Diamond, Least squares fitting of several fuzzy variables, in: J.C. Bezdek (Ed.), Analysis of Fuzzy Information, CRC Press, Tokyo, 1987, pp. 329–331.
[8] P. Diamond, Fuzzy least squares, Inform. Sci. 46 (1988) 141–157.
[9] H. Drucker, C.J.C. Burges, L. Kaufman, A. Smola, V. Vapnik, Support vector regression machines, in: M. Mozer, M. Jordan, T. Petsche (Eds.), Advances in Neural Information Processing Systems, vol. 9, MIT Press, Cambridge, MA, 1997, pp. 155–161.
[10] D. Dubois, H. Prade, Fuzzy Sets and Systems: Theory and Applications, Academic Press, New York, 1980.
[11] S. Gunn, Support Vector Machines for Classification and Regression, ISIS Tech. Report, University of Southampton, 1998.
[12] J. Kacprzyk, M. Fedrizzi, Fuzzy Regression Analysis, Physica-Verlag, Heidelberg, 1992.
[13] A.J. Smola, B. Schölkopf, A tutorial on support vector regression, NeuroCOLT2 Tech. Report, NeuroCOLT, 1998.
[14] H. Tanaka, S. Uejima, K. Asai, Fuzzy linear regression model, in: Internat. Congress on Applied Systems Research and Cybernetics, Acapulco, Mexico, 1980.
[15] H. Tanaka, S. Uejima, K. Asai, Linear regression analysis with fuzzy model, IEEE Trans. Systems Man Cybernet. 12 (6) (1982) 903–907.
[16] V. Vapnik, The Nature of Statistical Learning Theory, Springer, Berlin, 1995.
[17] V. Vapnik, Statistical Learning Theory, Springer, Berlin, 1998.
[18] V. Vapnik, S. Golowich, A. Smola, Support vector method for function approximation, regression estimation, and signal processing, Adv. Neural Inform. Process. Systems 9 (1996) 281–287.

U

11