Performance Evaluation of Least Squares SVR in Robust Dynamical System Identification

José Daniel A. Santos¹, César Lincoln C. Mattos² and Guilherme A. Barreto³

¹ Federal Institute of Education, Science and Technology of Ceará, Department of Industry, Maracanaú, Ceará, Brazil, [email protected]
² ³ Federal University of Ceará, Department of Teleinformatics Engineering, Center of Technology, Campus of Pici, Fortaleza, Ceará, Brazil, [email protected], [email protected]

Abstract. Least Squares Support Vector Regression (LS-SVR) is a powerful kernel-based learning tool for regression problems. Nonlinear system identification is one such problem, in which we aim at capturing the behavior in time of a dynamical system by building a black-box model from the measured input-output time series. Besides the difficulties involved in the specification of a suitable model itself, most real-world systems are subject to the presence of outliers in the observations. Hence, robust methods that can handle outliers suitably are desirable. In this regard, despite the existence of a few previous works on robustifying the LS-SVR for regression applications with outliers, its use for dynamical system identification has not been fully evaluated yet. Bearing this in mind, in this paper we assess the performances of two existing robust LS-SVR variants, namely WLS-SVR and RLS-SVR, in nonlinear system identification tasks containing outliers. These robust approaches are compared with the standard LS-SVR in experiments with three artificial datasets, whose outputs are contaminated with different amounts of outliers, and a real-world benchmarking dataset. The obtained results for infinite-steps-ahead prediction confirm that the robust LS-SVR variants consistently outperform the standard LS-SVR algorithm.

Keywords: Least Squares Support Vector Regression, Nonlinear Dynamical System Identification, NARX Model, Outliers.

1 Introduction

Least Squares Support Vector Machine (LS-SVM) is a widely used tool for classification [16] and regression [14, 15] in the fields of pattern recognition and machine learning. Particularly in regression, where the model is called Least Squares Support Vector Regression (LS-SVR), it has found broad applicability in areas such as time series forecasting [18], control [6] and system identification [1, 3]. Motivated by the theory of Support Vector Machines (SVMs) [19], LS-SVR uses a Sum-of-Squared-Error (SSE) cost function and equality constraints to replace the original convex Quadratic Programming (QP) optimization problem in SVM. Consequently, the global optimum is simpler to obtain, by solving a set of linear equations. However, despite this computationally attractive feature, the use of an SSE-based cost function can lead to estimates that are overly sensitive to the presence of outliers in the data, or to cases where the underlying assumption of a Gaussian distribution for the error variables is not realistic. In such scenarios, LS-SVR may present poor prediction performance on new data.

Despite the importance of robust methods for real-world applications, very few authors have developed learning strategies for LS-SVR models to handle outliers suitably during the training process. For example, Suykens et al. [17] were probably the first to propose a robust variant of the LS-SVR, by introducing a weighted version of it based on M-estimators [5]. Another approach worth mentioning was developed by Yang et al. [20], who introduced an iterative method based on a truncated least squares loss function, the Concave-Convex Procedure (CCCP) and the Newton algorithm.

It is worth noting that, while the aforementioned robust LS-SVR variants were developed for standard regression problems, their application to regression with time series data, especially to nonlinear system identification, is still an open issue. Nonlinear dynamical system identification is a complex problem which can be roughly understood as the application of well-defined nonlinear regression tools to the modeling of systems with memory (i.e. dynamics), aiming at describing the behavior in time of such systems, whether for control applications or for simulation purposes. Only recently have a few authors started to investigate robust LS-SVR versions for dynamical nonlinear black-box regression tasks [9, 8, 4]. However, these works evaluated their proposed models using results from one-step-ahead prediction tasks. For a more complete validation of a model in a dynamical system identification task, infinite-steps-ahead prediction (a.k.a. free simulation) is strongly recommended, since only then can one judge whether the model has indeed captured the relevant long-term dynamics of the system under study. As a consequence, infinite-steps-ahead prediction turns out to be a far less trivial problem than one-step-ahead prediction.

Given the above, the scope of this paper encompasses a comprehensive evaluation of the behavior of two robust LS-SVR variants, namely Weighted Least Squares Support Vector Regression (WLS-SVR) [17] and Robust Least Squares Support Vector Regression (RLS-SVR) [20], in nonlinear system identification tasks. For this purpose, we use three synthetic datasets, whose outputs are deliberately contaminated with different amounts of outliers, and a real-world benchmarking dataset. The validation scenarios of the performance comparison correspond to infinite-steps-ahead prediction tasks.

The remainder of this paper is organized as follows. In Section 2 we briefly discuss robust nonlinear system identification with NARX models. In Section 3, the LS-SVR, WLS-SVR and RLS-SVR models are described. In Section 4 the results of a comprehensive set of computer experiments are presented, and the paper is concluded in Section 5.

2 Robust Nonlinear Dynamical System Identification

Given a dynamical system that can be explained by a nonlinear autoregressive with exogenous inputs (NARX) model, its $i$-th input vector $x_i \in \mathbb{R}^P$ is obtained from $L_y$ past observed outputs $y_i \in \mathbb{R}$ and $L_u$ past control inputs $u_i \in \mathbb{R}$ [10]:

$$ y_i = m_i + \epsilon_i, \qquad m_i = g(x_i), \qquad \epsilon_i \sim \mathcal{N}(\epsilon_i \,|\, 0, \sigma_n^2), \tag{1} $$

$$ x_i = [y_{i-1}, y_{i-2}, \cdots, y_{i-L_y}, u_{i-1}, u_{i-2}, \cdots, u_{i-L_u}]^T, \tag{2} $$

where $i$ is the instant of observation, $m_i \in \mathbb{R}$ is the true (noiseless) output of the system, $g(\cdot)$ is an unknown nonlinear function and $\epsilon_i$ is Gaussian distributed observation noise. After $N$ instants, we have the dataset

$$ \mathcal{D} = \{(x_i, y_i)\}_{i=1}^{N} = (X, y), \tag{3} $$

where $X \in \mathbb{R}^{N \times P}$ is called the regressor matrix and $y \in \mathbb{R}^N$ is the vector of measured outputs. From the set $\mathcal{D}$, henceforth called the estimation data, we aim at building a model that explains well enough the dynamical behavior of the system under study. After the estimation of a suitable model, it may be used to simulate the dynamical output of the identified system through iterative predictions. Given a new instant $j$, the prediction for test data follows

$$ \hat{y}_j = f(x_j) + \epsilon_j, \tag{4} $$

$$ x_j = [\hat{y}_{j-1}, \hat{y}_{j-2}, \cdots, \hat{y}_{j-L_y}, u_{j-1}, u_{j-2}, \cdots, u_{j-L_u}]^T, \tag{5} $$

where $\hat{y}_j$ is the $j$-th estimated noisy output. This procedure, in which past estimated outputs are used as regressors, is usually called free simulation or infinite-steps-ahead prediction, and is the one adopted in this paper.

When the observed noise cannot be considered Gaussian, as in the presence of outliers, models obtained through Eq. (1) are not appropriate. In fact, the light tails of the Gaussian distribution cannot account for the error deviations caused by the outliers. In this paper we are interested in evaluating the performance of outlier-robust LS-SVR models in the identification of nonlinear dynamical systems. We present these models in the next section.
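For concreteness, the listing below sketches how the regressor construction of Eq. (2) and the free-simulation loop of Eqs. (4)-(5) can be implemented. This is a minimal illustration under our own naming conventions (the helpers build_regressors and free_simulation, and the generic trained predictor f, are not from the paper):

```python
import numpy as np

def build_regressors(y, u, Ly, Lu):
    """Build the NARX regressor matrix of Eq. (2) and its target vector."""
    L = max(Ly, Lu)
    X, t = [], []
    for i in range(L, len(y)):
        # x_i = [y_{i-1},...,y_{i-Ly}, u_{i-1},...,u_{i-Lu}]^T
        X.append(np.r_[y[i - Ly:i][::-1], u[i - Lu:i][::-1]])
        t.append(y[i])
    return np.array(X), np.array(t)

def free_simulation(f, y0, u, Ly, Lu):
    """Iterate Eqs. (4)-(5): past *predictions* are fed back as regressors.
    f maps a regressor vector to a scalar prediction (e.g., a trained LS-SVR)."""
    L = max(Ly, Lu)
    y_hat = list(y0[:L])  # the first L observed outputs seed the loop
    for j in range(L, len(u)):
        x_j = np.r_[np.array(y_hat[j - Ly:j])[::-1], u[j - Lu:j][::-1]]
        y_hat.append(f(x_j))
    return np.array(y_hat)
```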

3 Evaluated Models

Initially, let us consider the estimation dataset $\{(x_1, y_1), \ldots, (x_N, y_N)\}$, with inputs $x_i \in \mathbb{R}^P$ and corresponding outputs $y_i \in \mathbb{R}$. In a regression problem, the goal is to search for a function $f(\cdot)$ that approximates, with acceptable accuracy, the outputs $y_i$ for all instances of the available data. For the nonlinear case, $f$ usually takes the form

$$ f(x) = \langle w, \varphi(x) \rangle + b, \quad \text{with } w \in \mathbb{R}^P,\ b \in \mathbb{R}, \tag{6} $$


where $\langle \cdot, \cdot \rangle$ denotes the dot-product in the space of the input patterns, $w$ is a vector of weights, $b$ is a bias and $\varphi(\cdot)$ is a nonlinear map into some dot-product space $\mathcal{H}$, usually called the feature space. The formulation of the parameter estimation problem in LS-SVR leads to the minimization of the following functional [15, 14]:

$$ J(w, e) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} e_i^2, \tag{7} $$

subject to

$$ y_i = \langle w, \varphi(x_i) \rangle + b + e_i, \quad i = 1, 2, \ldots, N, \tag{8} $$

where $e_i = y_i - f(x_i)$ is the error due to the $i$-th input pattern and $C > 0$ is a regularization parameter. The Lagrangian function of the optimization problem in Eqs. (7) and (8) is

$$ \mathcal{L}(w, b, e, \alpha) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} e_i^2 - \sum_{i=1}^{N} \alpha_i \left[ \langle w, \varphi(x_i) \rangle + b + e_i - y_i \right], \tag{9} $$

where the $\alpha_i$'s are the Lagrange multipliers. The conditions for optimality are given by

$$ \frac{\partial \mathcal{L}}{\partial w} = 0 \implies w = \sum_{i=1}^{N} \alpha_i \varphi(x_i), \qquad \frac{\partial \mathcal{L}}{\partial b} = 0 \implies \sum_{i=1}^{N} \alpha_i = 0, $$
$$ \frac{\partial \mathcal{L}}{\partial e_i} = 0 \implies \alpha_i = C e_i, \qquad \frac{\partial \mathcal{L}}{\partial \alpha_i} = 0 \implies \langle w, \varphi(x_i) \rangle + b + e_i - y_i = 0, \tag{10} $$

for $i = 1, 2, \cdots, N$. After elimination of the variables $e_i$ and $w$, the optimal dual variables correspond to the solution of the following system of linear equations:

$$ \begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & \Omega + C^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, \tag{11} $$

where $y = [y_1, y_2, \ldots, y_N]^T$, $\mathbf{1} = [1, 1, \ldots, 1]^T$, $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T$ and $\Omega \in \mathbb{R}^{N \times N}$ is the kernel matrix, whose entries are $\Omega_{i,j} = k(x_i, x_j) = \langle \varphi(x_i), \varphi(x_j) \rangle$, where $k(\cdot, \cdot)$ is the chosen kernel function. The resulting LS-SVR model for nonlinear regression is given by

$$ f(x) = \sum_{i=1}^{N} \alpha_i k(x, x_i) + b, \tag{12} $$

where $\alpha$ and $b$ are the solution of the linear system in Eq. (11). The Gaussian kernel function $k(x, x_i) = \exp\left\{ -\frac{\|x - x_i\|_2^2}{2\sigma^2} \right\}$ was adopted in all the experiments in this paper.
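The dual solution of Eq. (11) and the predictor of Eq. (12) translate directly into a few lines of linear algebra. The sketch below (in Python rather than the Matlab used in Section 4; all names are ours) is one possible implementation, assuming the Gaussian kernel above:

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """K[i, j] = exp(-||a_i - b_j||^2 / (2 sigma^2))."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lssvr_fit(X, y, C, sigma):
    """Solve the (N+1)x(N+1) linear system of Eq. (11) for (b, alpha)."""
    N = len(y)
    Omega = gaussian_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = A[1:, 0] = 1.0            # the 1-vector blocks
    A[1:, 1:] = Omega + np.eye(N) / C    # Omega + C^{-1} I
    sol = np.linalg.solve(A, np.r_[0.0, y])
    return sol[0], sol[1:]               # b, alpha

def lssvr_predict(X_new, X, alpha, b, sigma):
    """Evaluate Eq. (12): f(x) = sum_i alpha_i k(x, x_i) + b."""
    return gaussian_kernel(X_new, X, sigma) @ alpha + b
```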

3.1 The WLS-SVR Model

The Weighted Least Squares Support Vector Regression (WLS-SVR) model, developed by Suykens et al. [17], is described by the minimization of the functional

$$ J(w, e) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} v_i e_i^2, \tag{13} $$

subject to Eq. (8), where $v = [v_1, \ldots, v_N]^T$ is a vector of weights associated with the estimation data. If $v_k = 0$, the corresponding data sample is effectively deleted from the model. As in LS-SVR, the optimal dual variables are given by the solution of the following system of linear equations:

$$ \begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & \Omega + C^{-1} V \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}, \tag{14} $$

where the diagonal matrix $V \in \mathbb{R}^{N \times N}$ is given by

$$ V = \mathrm{diag}\left\{ \frac{1}{v_1}, \ldots, \frac{1}{v_N} \right\}. \tag{15} $$

The weights $v_i$ are determined from the error variables $e_i = \alpha_i / C$ of the original LS-SVR solution in Eq. (11). In this paper, the robust estimates are obtained from the Hampel weight function [13, 17] as follows:

$$ v_i = \begin{cases} 1, & \text{if } |e_i/\hat{s}| \le c_1, \\ \dfrac{c_2 - |e_i/\hat{s}|}{c_2 - c_1}, & \text{if } c_1 < |e_i/\hat{s}| \le c_2, \\ 10^{-4}, & \text{otherwise}, \end{cases} \tag{16} $$

where $\hat{s} = \mathrm{IQR}/1.349$ is a robust estimate of the standard deviation of the LS-SVR error variables $e_i$. IQR stands for the interquartile range, i.e. the difference between the 75th and the 25th percentiles, and the constants $c_1, c_2$ are typically chosen as $c_1 = 2.5$ and $c_2 = 3.0$ [13].
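The resulting two-stage procedure is sketched below, reusing the hypothetical lssvr_fit and gaussian_kernel helpers from the previous sketch: a plain LS-SVR pass yields the errors $e_i = \alpha_i / C$, the Hampel function of Eq. (16) converts them into weights, and the weighted system of Eq. (14) is solved once.

```python
import numpy as np

def hampel_weights(e, c1=2.5, c2=3.0):
    """Weights of Eq. (16), with s_hat = IQR / 1.349."""
    s_hat = (np.percentile(e, 75) - np.percentile(e, 25)) / 1.349
    r = np.abs(e / s_hat)
    return np.where(r <= c1, 1.0,
           np.where(r <= c2, (c2 - r) / (c2 - c1), 1e-4))

def wlssvr_fit(X, y, C, sigma):
    """Unweighted LS-SVR pass, then one weighted solve of Eq. (14)."""
    b, alpha = lssvr_fit(X, y, C, sigma)      # plain LS-SVR first
    v = hampel_weights(alpha / C)             # e_i = alpha_i / C
    N = len(y)
    Omega = gaussian_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = Omega + np.diag(1.0 / v) / C  # Omega + C^{-1} V, Eq. (15)
    sol = np.linalg.solve(A, np.r_[0.0, y])
    return sol[0], sol[1:]
```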

3.2 The RLS-SVR Model

Let us consider again the WLS-SVR optimization problem in Eqs. (13) and (8), which is equivalent to the following unconstrained functional:

$$ J(w, b) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} v_i \left( y_i - (\langle w, \varphi(x_i) \rangle + b) \right)^2. \tag{17} $$

In order to avoid setting the weights of the estimation samples, Yang et al. [20] developed a robust approach, called Robust Least Squares Support Vector Regression (RLS-SVR), through the minimization of another unconstrained functional given by


$$ J(w, b) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} \mathrm{robust}_2(w, b, x_i, y_i), \tag{18} $$

where $\mathrm{robust}_2(\cdot)$ is a truncated least squares loss function:

$$ \mathrm{robust}_2(w, b, x_i, y_i) = \min\{ p, (y_i - (\langle w, \varphi(x_i) \rangle + b))^2 \}, \tag{19} $$

where $p \ge 0$ is the truncation parameter, which caps the errors and thereby reduces the effect of the outliers. Letting the error of the $i$-th estimation sample be $e_i = y_i - (\langle w, \varphi(x_i) \rangle + b)$, the loss function of the optimization problem in Eq. (18) can be rewritten as

$$ \mathrm{robust}_2(p, e_i) = \min\{p, e_i^2\} = \begin{cases} e_i^2, & \text{if } |e_i| \le \sqrt{p}, \\ p, & \text{if } |e_i| > \sqrt{p}. \end{cases} \tag{20} $$

It is easily seen from Eq. (20) that when $p$ is large enough, the solution of RLS-SVR is the same as that of LS-SVR. In this paper, we set $0 \le p \le 1$ in all the experiments.

The function $\mathrm{robust}_2(\cdot)$ is neither differentiable nor convex. In order to overcome this difficulty, the RLS-SVR approach first applies a smoothing procedure to the loss function $\mathrm{robust}_2(\cdot)$ [2]. Then, considering $z_i = \langle w, \varphi(x_i) \rangle + b$, we can write

$$ \mathrm{robust}_2(w, b, x_i, y_i) = \min\{p, (y_i - z_i)^2\} = (y_i - z_i)^2 + h(z_i), \tag{21} $$

where

$$ h(z_i) = \begin{cases} 0, & y_i - \sqrt{p} \le z_i \le y_i + \sqrt{p}, \\ p - (y_i - z_i)^2, & \text{otherwise}. \end{cases} \tag{22} $$

Note that $h(\cdot)$ is a non-smooth function. In order to solve that problem, $h(\cdot)$ is replaced with a smoothing function $h^*(\cdot)$ given by

$$ h^*(z_i) = \begin{cases} p - (y_i - z_i)^2, & z_i < y_i - \sqrt{p} - h \ \text{ or } \ z_i > y_i + \sqrt{p} + h, \\ -\dfrac{(h + 2\sqrt{p})\,(y_i + h - \sqrt{p} - z_i)^2}{4h}, & |z_i - y_i + \sqrt{p}| \le h, \\ 0, & y_i + h - \sqrt{p} < z_i < y_i - h + \sqrt{p}, \\ -\dfrac{(h + 2\sqrt{p})\,(y_i - h + \sqrt{p} - z_i)^2}{4h}, & |z_i - y_i - \sqrt{p}| \le h, \end{cases} \tag{23} $$

where $h$ is the smoothing parameter, typically taking values between 0.001 and 0.5. The function $h^*(\cdot)$ is continuous and twice-differentiable. Then, the functional in Eq. (18) can be rewritten as

$$ J_{\mathrm{rob}}(w, b) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} (y_i - z_i)^2 + \frac{C}{2}\sum_{i=1}^{N} h^*(z_i), \tag{24} $$

where

$$ J_{\mathrm{vex}}(w, b) = \frac{1}{2}\|w\|_2^2 + \frac{C}{2}\sum_{i=1}^{N} (y_i - z_i)^2, \tag{25} $$

and

$$ J_{\mathrm{cav}}(w, b) = \frac{C}{2}\sum_{i=1}^{N} h^*(z_i). \tag{26} $$

Since $J_{\mathrm{cav}}$ is non-convex, it is difficult to minimize the functional $J_{\mathrm{rob}}$ with classical convex optimization algorithms. The next step is therefore to use the Concave-Convex Procedure (CCCP) [21] to transform the concave-convex optimization problem into a series of convex optimization problems solved iteratively. Finally, the Newton algorithm [2] is applied to solve this series of convex optimization problems. These steps are detailed in [21, 20, 2]. Due to the lack of space, only the final iterative formula for obtaining the Lagrange multipliers and the bias is shown below:

$$ \begin{bmatrix} b^{t+1} \\ \alpha^{t+1} \end{bmatrix} = - \begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & I + C\Omega \end{bmatrix}^{-1} \begin{bmatrix} 0 \\ \lambda^t - Cy \end{bmatrix}, \tag{27} $$

where the vector $\lambda$ is iteratively calculated by

$$ \lambda_i^t = \begin{cases} C(y_i - z_i), & z_i < y_i - \sqrt{p} - h \ \text{ or } \ z_i > y_i + \sqrt{p} + h, \\ \dfrac{C(h + 2\sqrt{p})\,(y_i + h - \sqrt{p} - z_i)}{4h}, & |z_i - y_i + \sqrt{p}| \le h, \\ 0, & y_i + h - \sqrt{p} < z_i < y_i - h + \sqrt{p}, \\ \dfrac{C(h + 2\sqrt{p})\,(y_i - h + \sqrt{p} - z_i)}{4h}, & |z_i - y_i - \sqrt{p}| \le h. \end{cases} \tag{28} $$

Given a tolerance parameter $\varepsilon$, $b^{t+1}$ and $\alpha^{t+1}$ are computed from Eqs. (27) and (28); if $\|(b^{t+1}, \alpha^{t+1}) - (b^t, \alpha^t)\| < \varepsilon$ the process ends, otherwise one sets $t = t + 1$ and repeats the procedure.
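The loop below sketches this iteration, again reusing the hypothetical helpers from the LS-SVR sketch. It transcribes Eqs. (27) and (28) as reconstructed above; the exact sign and scaling conventions of Eq. (27) should be checked against [20], so treat this as a schematic rather than a verified implementation:

```python
import numpy as np

def rlssvr_fit(X, y, C, sigma, p=0.5, h=0.1, eps=1e-3, max_iter=50):
    """CCCP iteration of Eqs. (27)-(28), warm-started from plain LS-SVR."""
    N, sp = len(y), np.sqrt(p)
    Omega = gaussian_kernel(X, X, sigma)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = np.eye(N) + C * Omega        # I + C*Omega block of Eq. (27)
    b, alpha = lssvr_fit(X, y, C, sigma)     # initial (b^0, alpha^0)
    for _ in range(max_iter):
        z = Omega @ alpha + b                # z_i = <w, phi(x_i)> + b
        lam = C * (y - z)                    # outermost region of Eq. (28)
        lam[np.abs(z - y) < sp - h] = 0.0    # flat central region
        left = np.abs(z - y + sp) <= h       # transition bands
        right = np.abs(z - y - sp) <= h
        lam[left] = C * (h + 2 * sp) * (y[left] + h - sp - z[left]) / (4 * h)
        lam[right] = C * (h + 2 * sp) * (y[right] - h + sp - z[right]) / (4 * h)
        sol = -np.linalg.solve(A, np.r_[0.0, lam - C * y])   # Eq. (27)
        db = np.linalg.norm(np.r_[sol[0] - b, sol[1:] - alpha])
        b, alpha = sol[0], sol[1:]
        if db < eps:                         # stopping rule of the text
            break
    return b, alpha
```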

4 Simulations and Discussion

In order to evaluate the performances of the previously described models in nonlinear system identification under infinite-steps-ahead prediction scenarios, we carry out computer experiments with three artificial datasets, whose outputs are contaminated with outliers, and a real-world benchmarking dataset.

The first dataset, labeled Artificial 1, was generated according to [7] and is given by

$$ y_i = y_{i-1} - 0.5 \tanh(y_{i-1} + u_{i-1}^3), \tag{29} $$

$$ u_i \sim \mathcal{N}(u_i \,|\, 0, 1), \quad -1 \le u_i \le 1, \tag{30} $$

for both estimation and test data. The dataset contains 150 samples for estimation and 150 for test. The estimation data was corrupted with additive Gaussian noise with zero mean and variance 0.0025.

The remaining artificial datasets were generated according to the seminal work by Narendra and Parthasarathy [12]. The first one, labeled Artificial 2, is given by

$$ y_i = \frac{y_{i-1}\, y_{i-2}\, (y_{i-1} + 2.5)}{1 + y_{i-1}^2 + y_{i-2}^2} + u_{i-1}, \tag{31} $$

$$ u_i = \begin{cases} \mathcal{U}(-2, 2), & \text{for estimation data}, \\ \sin(2\pi i / 25), & \text{for test data}, \end{cases} \tag{32} $$


Table 1. RMSE values from free simulation results with and without outliers in the artificial datasets.

Artificial 1
outliers %   0%       5%       10%      20%
LS-SVR       0.0309   0.0654   0.1395   0.1518
WLS-SVR      0.0329   0.0626   0.1288   0.1174
RLS-SVR      0.0322   0.0682   0.1083   0.1134

Artificial 2
outliers %   0%       5%       10%      20%
LS-SVR       0.2805   0.3808   0.4937   0.8804
WLS-SVR      0.2363   0.3869   0.4634   0.6209
RLS-SVR      0.2854   0.4530   0.4036   0.6089

Artificial 3
outliers %   0%       5%       10%      20%
LS-SVR       0.2993   0.3150   0.6756   0.6467
WLS-SVR      0.2890   0.2915   0.2976   0.3973
RLS-SVR      0.2891   0.2921   0.2823   0.4606

where $\mathcal{U}(-2, 2)$ denotes a random number uniformly distributed between $-2$ and $2$. The dataset contains 300 samples for estimation and 100 samples for test. The estimation data was corrupted with additive Gaussian noise with zero mean and variance 0.29.

The last artificial dataset, labeled Artificial 3, is given by

$$ y_i = \frac{y_{i-1}}{1 + y_{i-1}^2} + u_{i-1}^3, \tag{33} $$

$$ u_i = \begin{cases} \mathcal{U}(-2, 2), & \text{for estimation data}, \\ \sin(2\pi i / 25) + \sin(2\pi i / 10), & \text{for test data}. \end{cases} \tag{34} $$
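As an illustration of how such data can be generated, the snippet below reproduces the Artificial 1 recursion of Eqs. (29)-(30). Since the text does not specify how the truncation of the Gaussian inputs is enforced, clipping is used here as one plausible choice, and a zero initial condition is assumed:

```python
import numpy as np

def artificial1(n, noise_var=0.0025, rng=np.random.default_rng(0)):
    """Simulate Eq. (29) with inputs as in Eq. (30); additive output noise."""
    u = np.clip(rng.normal(0.0, 1.0, n), -1.0, 1.0)  # N(0,1) truncated to [-1, 1]
    y = np.zeros(n)                                   # y_0 = 0 assumed
    for i in range(1, n):
        y[i] = y[i - 1] - 0.5 * np.tanh(y[i - 1] + u[i - 1] ** 3)
    return u, y + np.sqrt(noise_var) * rng.normal(size=n)

u_est, y_est = artificial1(150)  # estimation set; test set generated likewise
```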

Once again the dataset contains 300 samples for estimation and 100 test samples. The estimation data was corrupted with additive Gaussian noise with zero mean and variance 0.65.

The real-world dataset is called wing flutter and is available at the DaISy repository of the Katholieke Universiteit Leuven (http://homes.esat.kuleuven.be/smc/daisy/daisydata.html). This dataset corresponds to a mechanical SISO (Single Input Single Output) system with 1024 samples of each sequence $u_i$ and $y_i$, of which 512 samples were used for estimation and the other 512 for test.

All the artificial datasets were progressively corrupted with a number of outliers equal to 5%, 10% and 20% of the estimation samples. To each randomly chosen sample a uniformly distributed value $\mathcal{U}(-M_y, +M_y)$ was added, where $M_y$ is the maximum absolute output value. This outlier contamination methodology is similar to the one performed in [11], and is sketched in code below.

The orders $L_u$ and $L_y$ chosen for the regressors of each artificial dataset were set to their largest delays according to Eqs. (29), (31) and (33).
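A minimal sketch of the contamination step just described (function names are ours):

```python
import numpy as np

def add_outliers(y, frac, rng=np.random.default_rng(1)):
    """Add U(-My, +My) to a fraction `frac` of randomly chosen outputs."""
    y_out = y.copy()
    My = np.max(np.abs(y))                   # maximum absolute output value
    idx = rng.choice(len(y), size=int(round(frac * len(y))), replace=False)
    y_out[idx] += rng.uniform(-My, My, size=idx.size)
    return y_out

y_est_5 = add_outliers(y_est, 0.05)  # 5% contamination; likewise 0.10, 0.20
```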

Table 2. Free simulation RMSE with the wing flutter dataset.

LS-SVR   WLS-SVR   RLS-SVR
0.6433   0.4709    0.6433

For the wing flutter dataset, the orders $L_u, L_y \in \{1, 2, 3, 4, 5\}$ were set after the execution of a 5-fold cross-validation strategy. The same strategy was used to set the hyperparameters $C \in \{2^{-5}, 2^{-4}, \ldots, 2^{20}\}$ and $\sigma \in \{2^{-10}, 2^{-9}, \ldots, 2^{10}\}$ in the search for their optimal values. Furthermore, for the RLS-SVR model, a new 5-fold cross-validation is performed to search for the optimal values of the truncation parameter $p \in \{0.1, 0.2, \ldots, 1.0\}$ and the smoothing parameter $h \in \{0.01, 0.05, 0.10, 0.15, 0.20, 0.25, \ldots, 0.50\}$. The chosen value for the tolerance parameter was $\varepsilon = 0.001$. All algorithms were written in Matlab R2013a and the simulations ran on an HP ProBook notebook with a 2.30 GHz Intel Core i5 processor, 4 GB of RAM and the Windows 7 Professional operating system.

The obtained Root Mean Square Error (RMSE) values for the artificial datasets during the free simulation phase are reported in Table 1. In almost all cases contaminated with outliers the robust approaches consistently outperformed the traditional LS-SVR, the exception being the Artificial 2 dataset with 5% of outliers. In the scenarios without outliers, the robust models achieved performance close to that of the traditional LS-SVR, except for the Artificial 2 dataset, where the RMSE value of the WLS-SVR model was significantly lower than those achieved by the other methods.

Comparing the two robust models, WLS-SVR in general presented smaller RMSE values than RLS-SVR in the scenarios without outliers and with 5% of contamination, the only exception being the Artificial 1 dataset without outliers. RLS-SVR performed better in the scenarios with 10% and 20% of outliers in almost all datasets, except for Artificial 3 with 20% of outliers.

It is important to note that, despite the fact that WLS-SVR and RLS-SVR are outlier-robust methods, this does not mean they are fully insensitive to outliers. For the Artificial 1 and 2 datasets, the obtained RMSE increased significantly as the contamination increased. Good resilience to outliers was achieved only for the Artificial 3 dataset, where WLS-SVR and RLS-SVR were less affected for contamination levels of up to 10%.

The predicted outputs for the artificial test data are illustrated in Figs. 1 to 3, where the effect of the incremental addition of outliers to the estimation data can be more easily perceived. Note that in the scenarios with higher rates of contamination, especially for the Artificial 1 and 2 datasets (Figs. 1c, 1d, 2c and 2d), the predicted outputs can become very different from the real ones. It should also be observed that, as expected, the performance of the conventional LS-SVR model deteriorates faster than those of the robust approaches.

The obtained RMSE values for the wing flutter dataset during the free simulation experiments are shown in Table 2. The RMSE value of the WLS-SVR model was considerably lower than those of the other approaches.
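Before turning to the figures, the 5-fold grid search over $C$ and $\sigma$ described at the start of this section can be sketched as follows. This is our own code, reusing the hypothetical lssvr_fit/lssvr_predict helpers from Section 3; the paper does not detail how the folds are constructed, so a random split over the regressor rows is assumed:

```python
import numpy as np
from itertools import product

def grid_search_cv(X, y, k=5, rng=np.random.default_rng(2)):
    """5-fold CV over C in {2^-5..2^20} and sigma in {2^-10..2^10}."""
    folds = np.array_split(rng.permutation(len(y)), k)
    best_rmse, best_params = np.inf, None
    for C, sigma in product(2.0 ** np.arange(-5, 21), 2.0 ** np.arange(-10, 11)):
        rmse = 0.0
        for fold in folds:
            tr = np.setdiff1d(np.arange(len(y)), fold)
            b, alpha = lssvr_fit(X[tr], y[tr], C, sigma)
            pred = lssvr_predict(X[fold], X[tr], alpha, b, sigma)
            rmse += np.sqrt(np.mean((y[fold] - pred) ** 2)) / k
        if rmse < best_rmse:
            best_rmse, best_params = rmse, (C, sigma)
    return best_params
```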


Fig. 1. Free simulation with Artificial 1 dataset: (a) 0% outliers, (b) 5% outliers, (c) 10% outliers, (d) 20% outliers.

For this dataset, the LS-SVR and RLS-SVR outputs had exactly the same behavior, because the RLS-SVR model stopped after a single iteration. The simulated test outputs are illustrated in Figs. 4a-4c: the predicted outputs of the WLS-SVR model (Fig. 4b) followed the dynamical behavior of the system better than those of the LS-SVR and RLS-SVR models (Figs. 4a and 4c, respectively).

As a final remark, a few words on the computational complexity of the algorithms are in order. It was observed in the experiments that the training times of the RLS-SVR model were usually longer than those of the LS-SVR and WLS-SVR models. This was expected, since the computational complexities of the LS-SVR, WLS-SVR and RLS-SVR models are $O((N+1)^3)$, $O(2(N+1)^3)$ and $O(T_r (N+1)^3)$, respectively, where $T_r$ is the total number of iterations of the RLS-SVR model.

Fig. 2. Free simulation with Artificial 2 dataset: (a) 0% outliers, (b) 5% outliers, (c) 10% outliers, (d) 20% outliers.

5 Conclusions

In this paper we carried out a comprehensive performance evaluation of two robust LS-SVR variants, namely the WLS-SVR and RLS-SVR models, applied to nonlinear dynamical system identification tasks under free simulation scenarios and in the presence of outliers. Neither of these models had been evaluated in such difficult scenarios before.

The experiments were conducted on three artificial datasets, contaminated with different rates of outliers in the outputs of the estimation data, and a real-world dataset available in a benchmark identification database. In general, the results obtained in free simulation with the robust models presented lower RMSE values than those obtained with the traditional LS-SVR. Moreover, between the two robust models, RLS-SVR in general achieved the best results in the scenarios with higher rates of outliers.


Fig. 3. Free simulation with Artificial 3 dataset: (a) 0% outliers, (b) 5% outliers, (c) 10% outliers, (d) 20% outliers.

For scenarios without outliers and for the real-world dataset, the WLS-SVR model outperformed the other methods.

However, the results also showed that the WLS-SVR and RLS-SVR models are not fully insensitive to the presence of outliers. Both methods presented considerable performance differences between some outlier-free and outlier-corrupted scenarios. Depending on the application, such differences can make these approaches unfeasible in practice. In future work, we will continue to study and develop techniques to improve the robustness of LS-SVR algorithms applied to nonlinear dynamical system identification, besides investigating aspects of the computational complexity of the RLS-SVR method.

Fig. 4. Free simulation with wing flutter dataset: (a) LS-SVR, (b) WLS-SVR, (c) RLS-SVR.

Acknowledgments The authors thank the financial support of FUNCAP (Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico), IFCE (Instituto Federal de Educação, Ciência e Tecnologia do Ceará) and NUTEC (Núcleo de Tecnologia Industrial do Ceará).

References

1. Cai, Y., Wang, H., Ye, X., Fan, Q.: A multiple-kernel LSSVR method for separable nonlinear system identification. Journal of Control Theory and Applications 11(4), 651-655 (2013)
2. Chapelle, O.: Training a support vector machine in the primal. Neural Computation 19(5), 1155-1178 (2007)
3. Falck, T., Dreesen, P., De Brabanter, K., Pelckmans, K., De Moor, B., Suykens, J.A.K.: Least-squares support vector machines for the identification of Wiener-Hammerstein systems. Control Engineering Practice 20(11), 1165-1174 (2012)
4. Falck, T., Suykens, J.A.K., De Moor, B.: Robustness analysis for least squares kernel based regression: an optimization approach. In: Proceedings of the 48th IEEE Conference on Decision and Control (CDC/CCC 2009), held jointly with the 28th Chinese Control Conference, pp. 6774-6779. IEEE (2009)
5. Huber, P.J.: Robust estimation of a location parameter. The Annals of Mathematical Statistics 35(1), 73-101 (1964)
6. Khalil, H.M., El-Bardini, M.: Implementation of speed controller for rotary hydraulic motor based on LS-SVM. Expert Systems with Applications 38(11), 14249-14256 (2011)
7. Kocijan, J., Girard, A., Banko, B., Murray-Smith, R.: Dynamic systems identification with Gaussian processes. Mathematical and Computer Modelling of Dynamical Systems 11(4), 411-424 (2005)
8. Liu, Y., Chen, J.: Correntropy-based kernel learning for nonlinear system identification with unknown noise: an industrial case study. In: 10th International Symposium on Dynamics and Control of Process Systems, India, pp. 361-366 (2013)
9. Liu, Y., Chen, J.: Correntropy kernel learning for nonlinear system identification with outliers. Industrial and Engineering Chemistry Research, pp. 1-13 (2013)
10. Ljung, L.: System Identification: Theory for the User. 2nd edn. (1999)
11. Majhi, B., Panda, G.: Robust identification of nonlinear complex systems using low complexity ANN and particle swarm optimization technique. Expert Systems with Applications 38(1), 321-333 (2011)
12. Narendra, K.S., Parthasarathy, K.: Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks 1(1), 4-27 (1990)
13. Rousseeuw, P.J., Leroy, A.M.: Robust Regression and Outlier Detection. 1st edn. (1987)
14. Saunders, C., Gammerman, A., Vovk, V.: Ridge regression learning algorithm in dual variables. In: Proceedings of the 15th International Conference on Machine Learning (ICML-1998), pp. 515-521. Morgan Kaufmann (1998)
15. Suykens, J.A.K., Van Gestel, T., De Brabanter, J., De Moor, B., Vandewalle, J.: Least Squares Support Vector Machines. World Scientific Publishing, 1st edn. (2002)
16. Suykens, J.A.K., Vandewalle, J.: Least squares support vector machine classifiers. Neural Processing Letters 9(3), 293-300 (1999)
17. Suykens, J.A.K., De Brabanter, J., Lukas, L., Vandewalle, J.: Weighted least squares support vector machines: robustness and sparse approximation. Neurocomputing 48(1), 85-105 (2002)
18. Van Gestel, T., Suykens, J.A.K., Baestaens, D.E., Lambrechts, A., Lanckriet, G., Vandaele, B., De Moor, B., Vandewalle, J.: Financial time series prediction using least squares support vector machines within the evidence framework. IEEE Transactions on Neural Networks 12(4), 809-821 (2001)
19. Vapnik, V.N.: The Nature of Statistical Learning Theory. Springer (1995)
20. Yang, X., Tan, L., He, L.: A robust least squares support vector machine for regression and classification with noise. Neurocomputing 140, 41-52 (2014)
21. Yuille, A.L., Rangarajan, A.: The concave-convex procedure. Neural Computation 15(4), 915-936 (2003)
