International Conference on Intelligent Computing, Communication & Convergence (ICCC-2015), organized by the Interscience Institute of Management and Technology, Bhubaneswar, Odisha, India
A Model Ranking Based Selective Ensemble Approach for Time Series Forecasting

Ratnadip Adhikari*, Ghanshyam Verma, Ina Khandelwal

Department of Computer Science and Engineering, The LNM Institute of Information Technology, Jaipur-302031, India
Abstract

Time series analysis is a highly active research topic that encompasses various domains of science, engineering, and finance. A major challenge in this field is to obtain reasonably accurate forecasts of future data by analyzing past records. A fruitful alternative to using a single forecasting technique is to combine the forecasts from several conceptually different models. Numerous research studies in the literature strongly recommend this approach, because a combination of multiple forecasts almost always substantially reduces the overall forecasting errors and outperforms the component models. In this paper, we propose an ensemble method that selectively combines some of the constituent forecasting models, instead of combining all of them. On each time series, the component models are successively ranked as per their past forecasting accuracies, and then we combine the forecasts of a group of high-ranked models. Empirical analysis is conducted with nine individual models and four real-world time series datasets. The results clearly show that the proposed ensemble mechanism achieves consistently better accuracies than all the component models and other conventional forecast combination schemes.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the scientific committee of the International Conference on Computer, Communication and Convergence (ICCC 2015).
* Corresponding author. Tel.: +91-966-093-3276; E-mail address: [email protected]
doi:10.1016/j.procs.2015.04.104
Keywords: Time series; Combination of forecasts; Forecasting accuracy; Box-Jenkins models; Neural networks; Support vector machines.
1. Introduction

Time series analysis is a highly important and dynamic research domain with numerous practical applications. Its primary objective is to develop a mathematical model that estimates the underlying data generation process, retaining the statistical properties of the series, and then to forecast the desired number of future observations through this model. Appropriate modeling and forecasting of a time series is a considerably difficult task, mainly due to several unintended characteristics often associated with the series. These include nonstationarity, irregular fluctuations, seasonal and cyclical variations, deviations from standard statistical specifications, and severe multicollinearity among the observations [1]. Since no single model copes well with all of these, the most appropriate alternative is to combine the forecasts from several structurally different models instead of adopting only one. Forecasts combination is based on the rationale that no specific model alone can consistently achieve the best forecasts for a class of time series, but multiple models in unison can provide a very close estimation of the actual data generation process [2]. A number of renowned research works in this domain have demonstrated that a combination of forecasts generally achieves much better forecasting accuracy than each component model. Moreover, this approach also substantially reduces the risk associated with selecting a single forecasting technique [3, 4].

Throughout the past two decades, there has been an overwhelming amount of research on combining forecasts, mainly due to its outstanding potency of accuracy enhancement. As a result, a variety of combination techniques have been developed in the literature [2, 3]. Most of them form a weighted linear combination of the component forecasts, the weights being determined from the past forecasting records of the participating models. They range from simple statistical techniques, e.g. the simple average, trimmed mean, winsorized mean, and median [5], to more advanced methods, e.g. the outperformance and optimal linear combinations of forecasts [2, 6]. Recently, Adhikari and Agrawal [7] have comprehensively reviewed the performances of several linear forecasts combination techniques on nine real time series datasets. An important finding from past as well as recent research is that simple combination techniques generally achieve considerably better accuracies than more complex schemes. We further notice that there has been little work on selecting the suitable models in an ensemble; the existing works typically combine all component forecasts. However, not all models will produce good forecasts for a particular time series, so tactically discarding some of them can potentially improve the overall accuracy to a large extent. This observation is the primary motivation behind the present work.

In this paper, we propose an ensemble methodology that combines the forecasts from some selected component models. The appropriate subset of forecasts to combine is selected through a ranking mechanism. At first, the models are successively ranked between one and the total number of models, so that a model with a comparatively smaller in-sample forecasting error gets a smaller, i.e. better, rank and vice versa. Then, starting with the first rank, we consecutively select a predefined number of models and form a weighted linear combination of their forecasts.
The weight to each model in this group is assigned to be inversely proportional to its in-sample forecasting error. In this manner, the proposed approach selectively combines the forecasts from a group of better performing models and discards the others. In order to check the precision and effectiveness of our approach, empirical analysis is carried out with nine individual forecasting models on four real time series datasets. The forecasting performance of the proposed ensemble is compared with those of the individual models as well as a number of other traditional linear combination techniques, through two popular error measures.

The remainder of the paper is organized as follows. Section 2 describes various well-known linear forecasts combination techniques and Section 3 presents the proposed ensemble mechanism. Section 4 reports the empirical analysis and finally Section 5 concludes the paper.

2. The ensemble forecasting paradigm

The most popular and widely used ensemble method is to form a linear combination of the constituent forecasts. Let $\mathbf{Y} = \left[ y_1, y_2, \ldots, y_N \right]^T$ be the actual out-of-sample testing dataset of a time series and $\hat{\mathbf{Y}}^{(i)} = \left[ \hat{y}_1^{(i)}, \hat{y}_2^{(i)}, \ldots, \hat{y}_N^{(i)} \right]^T$ be its forecast through the $i$th model $(i = 1, 2, \ldots, n)$. Then, a linear combination of these $n$ forecasts is obtained as follows:

$$\hat{y}_k = w_1 \hat{y}_k^{(1)} + w_2 \hat{y}_k^{(2)} + \cdots + w_n \hat{y}_k^{(n)} = \sum_{i=1}^{n} w_i \hat{y}_k^{(i)}, \quad k = 1, 2, \ldots, N. \tag{1}$$

Here, $w_i$ is the weight assigned to the $i$th forecasting model. Usually, the weights are assumed to be nonnegative, i.e. $w_i \geq 0 \ \forall i$, and unbiased, i.e. $\sum_{i=1}^{n} w_i = 1$. The combined forecast vector for $\mathbf{Y}$ is given by $\hat{\mathbf{Y}} = \left[ \hat{y}_1, \hat{y}_2, \ldots, \hat{y}_N \right]^T$.
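As a brief illustration of Eq. (1), the following minimal Python sketch (NumPy is assumed; the forecast values and weights are hypothetical, chosen only for demonstration) forms such a weighted linear combination:

```python
import numpy as np

# Stack the n component forecasts as rows: shape (n, N).
# These values are hypothetical, purely for illustration.
forecasts = np.array([
    [10.2, 11.1, 12.3],   # model 1
    [ 9.8, 10.7, 12.9],   # model 2
    [10.5, 11.4, 11.8],   # model 3
])

# Nonnegative weights summing to one (the unbiasedness condition).
weights = np.array([0.5, 0.3, 0.2])

# Eq. (1): combined forecast y_hat_k = sum_i w_i * y_hat_k^(i).
combined = weights @ forecasts   # shape (N,)
print(combined)
```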
Over the years, various linear forecasts combination methods have been developed in the literature on the basis of different weight assignment techniques. Some widely popular ones are briefly discussed here.

The simple average is the most intuitive and easiest combination method: it assigns equal weights to all component forecasts, so that $w_i = 1/n$, $i = 1, 2, \ldots, n$. Due to its virtues of remarkable accuracy, impartiality, and robustness, the simple average is often a favorable choice in combining forecasts [4-6]. The median, trimmed mean, and winsorized mean are other successful alternatives to it. A trimmed mean and a winsorized mean both form a simple average after taking the $\alpha$ smallest and $\alpha$ largest forecasts and either completely discarding them or setting them equal to the $(\alpha+1)$th smallest and the $(\alpha+1)$th largest forecasts, respectively [6]. The simple average and the median are in fact particular cases of a trimmed mean, corresponding to no trimming and maximum possible trimming, respectively.

In an Error Based (EB) method, the weights of the component models are assigned to be inversely proportional to their in-sample forecasting errors. Thus, a model with more error receives less weight and vice versa. Usually, the in-sample forecasting errors are measured through some total absolute error statistic, e.g. the Sum of Squared Error (SSE) [2, 7]. A Differential Weighting (DW) scheme is an alternative to the EB method that adaptively estimates the combining weights from the past forecasting records of the constituent models. Here, we use a popular DW method from the work of Winkler and Makridakis [8]. Its weighting scheme is as follows:
$$w_{i,t} = \beta\, w_{i,t-1} + (1 - \beta)\, \frac{\left[ \sum_{s=t-v}^{t-1} \left( e_s^{(i)} \right)^2 \right]^{-1}}{\sum_{j=1}^{n} \left[ \sum_{s=t-v}^{t-1} \left( e_s^{(j)} \right)^2 \right]^{-1}}, \quad i = 1, 2, \ldots, n. \tag{2}$$
Here, $n$ is the number of models; $t$ is the forecasting period; $w_{i,t}$ is the weight assigned to the $i$th model on the basis of the data preceding the period $t$; $e_t^{(i)}$ is the percentage forecast error of the $i$th model at time $t$; $v$ is the number of preceding periods over which the errors are accumulated; and $\beta \in (0, 1)$ is a constant parameter. Following Winkler and Makridakis [8], in this study we set $\beta = 0.7$.

In the Ordinary Least Squares (OLS) method, the component forecasts, together with a constant, are used as the regression terms in an OLS regression, and the weights are determined by minimizing the combined forecast SSE [5, 9, 10]. This method is more general in the sense that it omits the requirements of nonnegativity and unbiasedness, but it includes the risk of producing negative weights, which are often insensible [5, 9]. In practical applications, the weights are determined by minimizing an in-sample combined forecast SSE. The outperformance method, proposed by Bunn [11], determines the combining weights from the number of times the corresponding models performed best in past in-sample forecasting trials. It considers each weight as the probability that the respective model will outperform the others, i.e. produce the least error, in the next trial. It is a very successful and robust nonparametric approach of combining forecasts [5].
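The following minimal Python sketch illustrates two of these weighting schemes: one DW update per Eq. (2) and the simple frequency version of the outperformance weights. It is a hedged illustration only, not the authors' code; the error matrix and all parameter values are hypothetical.

```python
import numpy as np

def dw_update(w_prev, errors, beta=0.7):
    """One DW step, Eq. (2): errors is an (n, v) array holding each
    model's last v percentage forecast errors (hypothetical values)."""
    inv_sse = 1.0 / np.sum(errors ** 2, axis=1)       # [sum_s (e_s)^2]^(-1) per model
    return beta * w_prev + (1.0 - beta) * inv_sse / inv_sse.sum()

def outperformance_weights(errors):
    """Outperformance-style weights: the fraction of past trials in
    which each model produced the smallest absolute error."""
    n, trials = errors.shape
    best = np.argmin(np.abs(errors), axis=0)          # winner of each trial
    counts = np.bincount(best, minlength=n)
    return counts / trials

# Hypothetical percentage errors of n = 3 models over v = 4 periods.
errors = np.array([[1.2, 0.8, 1.1, 0.9],
                   [2.0, 1.5, 1.8, 2.2],
                   [0.7, 1.0, 0.6, 1.3]])
w = np.full(3, 1/3)                                   # start from equal weights
w = dw_update(w, errors)
print(w, outperformance_weights(errors))
```

Note that Bunn's original outperformance method is Bayesian with a prior over the models; the frequency-count version above is a simplified sketch of the same idea.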
3. The proposed forecasts combination methodology

A major challenge in forecasts combination is to select the appropriate component models. Ideally, an ensemble should neither include models with reasonably bad forecasting accuracies nor discard any potentially good model. Evaluating the out-of-sample forecasting potency of a model in advance is very difficult and, as such, there have been considerably limited works in this direction [12]. Most of the existing ensemble schemes consider a group of component models and combine all the obtained forecasts. But such an ensemble obviously runs a genuine risk of including some models with reasonably poor performances, which ultimately deteriorate the overall combined forecasting accuracy. In this study, we propose an approach that combines the forecasts from a class of good models, selected through a ranking mechanism that is based on the past forecasting records of the component models. The models with potentially poor accuracies get filtered out through this ranking technique. The following steps are executed in our proposed ensemble method.

Step 1. We divide the original time series $\mathbf{Y} = \left[ y_1, y_2, \ldots, y_N \right]^T$ into the in-sample training dataset $\mathbf{Y}_{tr} = \left[ y_1, y_2, \ldots, y_{N_{tr}} \right]^T$, the in-sample validation dataset $\mathbf{Y}_{vd} = \left[ y_{N_{tr}+1}, y_{N_{tr}+2}, \ldots, y_{N_{tr}+N_{vd}} \right]^T$, and the out-of-sample testing dataset $\mathbf{Y}_{ts} = \left[ y_{N_{in}+1}, y_{N_{in}+2}, \ldots, y_{N_{in}+N_{ts}} \right]^T$, so that $N_{in} = N_{tr} + N_{vd}$ is the size of the total in-sample dataset and $N_{in} + N_{ts} = N$.
Step 2. Suppose we have $n$ component forecasting models and obtain $\hat{\mathbf{Y}}_{ts}^{(i)} = \left[ \hat{y}_{N_{in}+1}^{(i)}, \hat{y}_{N_{in}+2}^{(i)}, \ldots, \hat{y}_{N_{in}+N_{ts}}^{(i)} \right]^T$ as the forecast of $\mathbf{Y}_{ts}$ through the $i$th model.
Step 3. We implement each model on $\mathbf{Y}_{tr}$ and use it to predict $\mathbf{Y}_{vd}$. Let $\hat{\mathbf{Y}}_{vd}^{(i)} = \left[ \hat{y}_{N_{tr}+1}^{(i)}, \hat{y}_{N_{tr}+2}^{(i)}, \ldots, \hat{y}_{N_{tr}+N_{vd}}^{(i)} \right]^T$ be the prediction of $\mathbf{Y}_{vd}$ through the $i$th model.

Step 4. We find the in-sample forecasting error of each model through some suitable error measure. The Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE) are three widely popular error statistics, defined as follows [13]:

$$\mathrm{MAE}\left( \mathbf{X}, \hat{\mathbf{X}} \right) = \frac{1}{N} \sum_{t=1}^{N} \left| e_t \right|, \quad \mathrm{MSE}\left( \mathbf{X}, \hat{\mathbf{X}} \right) = \frac{1}{N} \sum_{t=1}^{N} e_t^2, \quad \mathrm{MAPE}\left( \mathbf{X}, \hat{\mathbf{X}} \right) = \frac{1}{N} \sum_{t=1}^{N} \left| \frac{e_t}{x_t} \right| \times 100,$$

where $\mathbf{X} = \left[ x_1, x_2, \ldots, x_N \right]^T$ and $\hat{\mathbf{X}} = \left[ \hat{x}_1, \hat{x}_2, \ldots, \hat{x}_N \right]^T$ are respectively the actual and forecasted datasets and $e_t = x_t - \hat{x}_t$ is the forecasting error at time $t$. In the present study, we adopt the MSE to find the in-sample
forecasting errors of the component models.

Step 5. Based on the obtained in-sample forecasting errors, we assign a score to each component model as $s_i = 1 \big/ \mathrm{MSE}\left( \mathbf{Y}_{vd}, \hat{\mathbf{Y}}_{vd}^{(i)} \right)$, $i = 1, 2, \ldots, n$. The scores are assigned to be inversely proportional to the respective errors, so that a model with a comparatively smaller in-sample error receives a higher score and vice versa.

Step 6. We assign a rank $r_i \in \{1, 2, \ldots, n\}$ to the $i$th model on the basis of its score, so that $r_i \leq r_j$ if $s_i \geq s_j$, $\forall i, j \in \{1, 2, \ldots, n\}$. The minimum, i.e. the best, rank is 1 and the maximum, i.e. the worst, rank is at most $n$.

Step 7. We choose a number $n_r$ such that $1 \leq n_r \leq n$ and let $I = \left\{ i_1, i_2, \ldots, i_{n_r} \right\}$ be the index set of the $n_r$ component models whose ranks lie in $[1, n_r]$. Thus, we select a subgroup of the $n_r$ smallest ranked component models.

Step 8. Finally, we obtain the weighted linear combination of these selected $n_r$ component forecasts as follows:

$$\hat{y}_k = w_{i_1} \hat{y}_k^{(i_1)} + w_{i_2} \hat{y}_k^{(i_2)} + \cdots + w_{i_{n_r}} \hat{y}_k^{(i_{n_r})} = \sum_{i \in I} w_i \hat{y}_k^{(i)}, \quad k = 1, 2, \ldots, N. \tag{3}$$

Here, $w_{i_k} = s_{i_k} \big/ \sum_{k=1}^{n_r} s_{i_k}$ is the normalized weight of the selected component model, so that $\sum_{k=1}^{n_r} w_{i_k} = 1$.
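To make Steps 4-8 concrete, here is a minimal, self-contained Python sketch of the ranking-based selection and combination. It is a hedged illustration only: the per-model validation and test forecasts are assumed to be already computed, and all array values below are hypothetical toy data.

```python
import numpy as np

def selective_ensemble(y_vd, vd_forecasts, ts_forecasts, n_r):
    """Steps 4-8: score by inverse validation MSE, rank, keep the
    n_r best models, and combine their test forecasts.
    y_vd         : (N_vd,)  actual validation data
    vd_forecasts : (n, N_vd) per-model validation predictions
    ts_forecasts : (n, N_ts) per-model test forecasts
    """
    # Step 4: in-sample (validation) MSE of each model.
    mse = np.mean((vd_forecasts - y_vd) ** 2, axis=1)
    # Step 5: score inversely proportional to the error.
    scores = 1.0 / mse
    # Steps 6-7: rank by score and keep the n_r best models.
    selected = np.argsort(-scores)[:n_r]          # indices with ranks 1..n_r
    # Step 8: normalized weights and the combined forecast, Eq. (3).
    w = scores[selected] / scores[selected].sum()
    return w @ ts_forecasts[selected]

# Hypothetical toy data: n = 4 models, N_vd = 5, N_ts = 3.
rng = np.random.default_rng(0)
y_vd = rng.normal(size=5)
vd_forecasts = y_vd + rng.normal(scale=[[0.1], [0.5], [0.2], [1.0]], size=(4, 5))
ts_forecasts = rng.normal(size=(4, 3))
print(selective_ensemble(y_vd, vd_forecasts, ts_forecasts, n_r=2))
```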
In the described ensemble scheme, the selection of the appropriate validation set, i.e. the parameter $N_{vd}$, and of the group size $n_r$ is very important. The validation set should reflect the characteristics of the testing dataset, which is practically unknown in advance. As such, in this study we set $N_{vd}$ equal to $N_{ts}$, the size of the testing dataset. Also, the group size $n_r$ should be selected so that it is neither too small nor too large.

4. Empirical analysis

To test the effectiveness of the proposed ensemble approach, we have carried out experiments in MATLAB on four important real time series datasets. The individual models are selected from the following widely popular classes: random walk [14], Box-Jenkins [13, 14], Artificial Neural Network (ANN) [13, 14], and Support Vector Machine (SVM) [15, 16]. The random walk and Box-Jenkins models are linear in the sense that each future value is assumed to be a linear function of past observations, whereas ANN and SVM have recognized capabilities of learning nonlinear structures in a time series.

Two Box-Jenkins subclasses, viz. the Autoregressive Integrated Moving Average (ARIMA) [13, 14] and Seasonal ARIMA (SARIMA) [13] models, are used in this work and are fitted through the default ARIMA class of the Econometrics toolbox of MATLAB. These two models are commonly expressed as ARIMA(p, d, q) and SARIMA(p, d, q)×(P, D, Q)s, where the parameters (p, P), (d, D), and (q, Q) respectively denote the orders of the autoregressive, differencing, and moving average processes, and s is the period of seasonality. The appropriate parameters are determined through the Box-Jenkins model building methodology [14].

For ANN, we have considered three different variants, viz. the Feedforward ANN (FANN) [2, 14], Elman ANN (EANN) [2, 17], and Generalized Regression Neural Network (GRNN) [1], which are implemented through the default neural network toolbox [19] of MATLAB. Both the iterative (ITER) and direct (DIR) approaches of ANN forecasting are used here. The former predicts a single observation at a time, whereas the latter predicts all the future values in one step [18]. The appropriate ANN structure i×h×o, consisting of the numbers of input nodes (i), hidden nodes (h), and output nodes (o), is identified through in-sample validations.

For SVM modeling, we have used the Least Squares SVM (LS-SVM) framework, developed by Suykens and Vandewalle [16] and implemented through the LS-SVMlab toolbox [20], considering the Radial Basis Function (RBF) kernel. The optimal values of the parameters, viz. the regularization constant (C) and the RBF tuning parameter (σ), are selected through a 10-fold cross-validation, with search ranges $[10^{-5}, 10^5]$ and $[2^{-10}, 2^{10}]$, respectively.
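For readers without MATLAB, a rough Python analogue of this parameter search can be sketched with scikit-learn, using kernel ridge regression as a stand-in for LS-SVM (the two are closely related, but this is not the paper's setup). The lag order, grid resolution, and data below are hypothetical assumptions.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import GridSearchCV

def lag_matrix(series, p):
    """Embed a univariate series into (samples, p) lagged inputs
    with one-step-ahead targets."""
    X = np.column_stack([series[i:len(series) - p + i] for i in range(p)])
    return X, series[p:]

# Hypothetical synthetic series and lag order, purely for illustration.
series = np.sin(np.linspace(0, 20, 300)) + 0.1 * np.random.default_rng(1).normal(size=300)
X, y = lag_matrix(series, p=8)

# alpha acts roughly like an inverse of the LS-SVM regularization constant C,
# and gamma like 1/(2*sigma^2) for the RBF kernel (assumed correspondence).
grid = {"alpha": np.logspace(-5, 5, 11), "gamma": np.logspace(-3, 3, 7)}
search = GridSearchCV(KernelRidge(kernel="rbf"), grid, cv=10,
                      scoring="neg_mean_squared_error")
search.fit(X, y)
print(search.best_params_)
```

A plain 10-fold split is used here to mirror the paper; a time-ordered split such as TimeSeriesSplit would be the more cautious choice for sequential data.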
Four real time series are used in this study: (1) River flow: the monthly flow (in cms) of the Clearwater river at Kamiah, Idaho, USA, from 1911 to 1965 [21]; (2) Employment: the monthly government employment of the USA by statistical area, in thousands of persons, from January 1990 to June 2014 [21]; (3) AUD-INR exchange rate: the monthly Australian dollar (AUD) to Indian rupee (INR) exchange rates from January 1993 to July 2014 [22]; (4) Facebook: the daily closing stock prices of Facebook from 18 May 2012 to 3 September 2014 [23]. The time plots of these series are depicted in Fig. 1, and their sizes, types, and optimal model parameters are presented in Table 1.
Fig. 1. The time plots of the series: (a) River flow, (b) Employment, (c) AUD-INR exchange rate, (d) Facebook.

Table 1. Modeling information of the four time series datasets

Information             River flow                 Employment                    AUD-INR exchange rate          Facebook
Size (total, testing)   (600, 100)                 (294, 61)                     (259, 56)                      (576, 92)
Type                    Stationary, non-seasonal   Monthly seasonal              Non-stationary, non-seasonal   Non-stationary, non-seasonal
Box-Jenkins model       ARIMA(16, 0, 0)            SARIMA(0, 1, 1)×(0, 1, 1)12   ARIMA(2, 0, 0)                 ARIMA(1, 0, 0)
ITER-FANN               8×6×1                      9×3×1                         5×1×1                          9×11×1
DIR-FANN                8×6×100                    5×5×61                        7×7×56                         2×2×92
ITER-EANN               8×15×1                     7×9×1                         6×15×1                         13×15×1
DIR-EANN                8×15×100                   6×5×61                        1×12×56                        3×11×92
ITER-GRNN               11×5×1                     1×1×1                         4×4×1                          3×1×1
DIR-GRNN                12×5×100                   13×1×61                       2×1×56                         2×1×92
LS-SVM (C, σ)           (2.413, 742.785)           (966.835, 336.116)            (0.732, 1.668)                 (4984.848, 0.527)
In Table 2, we present the forecasting results obtained through all methods, in terms of the MAE and MSE. In this work, we take the group size $n_r = 5$, so that the first five high-ranked models out of nine are considered for combining and the remaining four low-ranked models are discarded.

Table 2. Forecasting results through all methods

                     River flow (3)       Employment           AUD-INR exchange rate   Facebook
Forecasting errors   MAE      MSE         MAE      MSE         MAE      MSE             MAE      MSE
Random walk          1.755    7.059       4.553    32.114      1.336    3.042           1.671    4.349
Box-Jenkins          1.260    2.606       4.792    31.413      1.307    2.424           1.824    4.671
ITER-FANN            0.913    1.787       4.580    32.762      1.245    2.640           1.851    5.358
DIR-FANN             1.137    1.982       5.025    35.720      1.760    4.732           1.987    6.301
ITER-EANN            1.036    2.189       4.788    34.732      1.342    2.747           1.921    5.772
DIR-EANN             1.585    4.257       4.531    30.200      1.415    3.157           1.855    5.430
ITER-GRNN            1.057    3.250       6.157    56.740      1.437    3.267           1.781    4.736
DIR-GRNN             1.947    4.684       5.431    42.850      1.299    2.525           2.233    6.899
LS-SVM               1.040    1.960       5.058    36.797      1.307    2.875           1.943    5.930
Simple average       0.805    1.656       4.067    25.146      0.798    1.057           0.983    1.678
Trimmed mean (1)     0.793    1.584       4.308    27.803      0.806    1.062           1.072    1.886
Winsorized mean      0.798    1.610       4.268    27.386      0.799    1.050           1.056    1.907
Median               0.818    1.615       4.379    28.408      0.909    1.263           1.170    2.056
EB (2)               0.778    1.476       4.347    28.336      0.822    1.073           0.989    1.747
DW                   0.771    1.555       4.173    26.229      0.805    1.057           0.985    1.703
OLS                  1.128    2.448       4.238    27.013      1.346    2.908           0.986    1.730
Outperformance       0.854    1.555       4.438    29.778      0.915    1.255           1.113    2.142
Proposed             0.766    1.437       3.715    21.154      0.745    0.950           0.971    1.640

(1) 20% trimming is used; (2) the weight to each model is inversely proportional to its in-sample MSE; (3) original MAE = MAE×10^2, original MSE = MSE×10^4.
From Table 2, it can be clearly seen that no individual model attains the uniformly best accuracies for all time series, and that the combination methods have achieved overall better forecasting results than the component models. Further, the proposed method has achieved the least errors, and hence the best accuracies, for each time series. In Table 3, we present the ranks assigned to the component models through our proposed ensemble scheme, and in Fig. 2 we depict the graphs of the actual testing datasets and their forecasts through the proposed method.

Table 3. The assigned ranks to the component models through the proposed ensemble

Models        River flow   Employment   AUD-INR exchange rate   Facebook
Random walk   9            4            1                       1
Box-Jenkins   5            8            4                       3
ITER-FANN     1            1            3                       2
DIR-FANN      2            7            8                       8
ITER-EANN     6            2            9                       9
DIR-EANN      7            6            2                       7
ITER-GRNN     8            5            6                       6
DIR-GRNN      4            9            5                       4
LS-SVM        3            3            7                       5
Fig. 2. Testing set and its forecast through the proposed method for: (a) River flow, (b) Employment, (c) AUD-INR exchange rate, (d) Facebook
Fig. 2 visually depicts the forecasting precision of the proposed selective ensemble method. The actual testing dataset and its forecast through the proposed method are depicted by the solid and dotted lines, respectively. The remarkable closeness between the actual and forecasted observations is clearly evident in Fig. 2.

5. Conclusions

Obtaining reasonably precise forecasts of time series datasets is a major challenge in many domains of science, engineering, and finance. Numerous research studies show that combining the forecasts from multiple structurally different models substantially improves the forecasting accuracy and often outperforms all the component models. In this study, an ensemble methodology is proposed that ranks the component models on the basis of their in-sample forecasting errors and then selectively combines the forecasts from a predefined number of high-ranked models. Thus, only a group of better performing models is combined and the others are discarded. Empirical analysis is conducted with nine individual models on four real-world time series datasets. The obtained results clearly demonstrate that the proposed method attains consistently better accuracies than all the component models as well as several other popular forecasts combination mechanisms. As such, this study justifies the superiority of a selective ensemble over combining all available forecasts. In future works, the proposed combination approach can be further explored with other varieties of forecasting models as well as more diverse time series datasets.

References

1. Gheyas, I. A., Smith, L. S. (2011). A novel neural network ensemble architecture for time series forecasting. Neurocomputing 74 (18), 3855-3864.
2. Lemke, C., Gabrys, B. (2010). Meta-learning for time series forecasting and forecast combination. Neurocomputing 73, 2006-2016.
3. Terui, N., van Dijk, H. K. (2002). Combined forecasts from linear and nonlinear time series models. International Journal of Forecasting 18, 421-438.
4. De Gooijer, J. G., Hyndman, R. J. (2006). 25 years of time series forecasting. International Journal of Forecasting 22, 443-473.
5. De Menezes, L. M., Bunn, D. W., Taylor, J. W. (2000). Review of guidelines for the use of combined forecasts. European Journal of Operational Research 120 (1), 190-204.
6. Jose, V. R. R., Winkler, R. L. (2008). Simple robust averages of forecasts: Some empirical results. International Journal of Forecasting 24 (1), 163-169.
7. Adhikari, R., Agrawal, R. K. (2012). Performance evaluation of weight selection schemes for linear combination of multiple forecasts. Artificial Intelligence Review, 1-20.
8. Winkler, R. L., Makridakis, S. (1983). The combination of forecasts. Journal of the Royal Statistical Society A 146 (2), 150-157.
9. Granger, C. W. J., Ramanathan, R. (1984). Improved methods of combining forecasts. Journal of Forecasting 3, 197-204.
10. Freitas, P. S., Rodrigues, A. J. (2006). Model combination in neural-based forecasting. European Journal of Operational Research 173, 801-814.
11. Bunn, D. (1975). A Bayesian approach to the linear combination of forecasts. Operational Research Quarterly 26 (2), 325-329.
12. Che, J. (2014). Optimal sub-models selection algorithm for combination forecasting model. Neurocomputing. doi: http://dx.doi.org/10.1016/j.neucom.2014.09.028.
13. Hamzaçebi, C. (2008). Improving artificial neural networks' performance in seasonal time series forecasting. Information Sciences 178, 4550-4559.
14. Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing 50, 159-175.
15. Vapnik, V. (1995). The nature of statistical learning theory. Springer-Verlag, New York.
16. Suykens, J. A. K., Vandewalle, J. (1999). Least squares support vector machine classifiers. Neural Processing Letters 9 (3), 293-300.
17. Zhao, J., Zhu, X., Wang, W., Liu, Y. (2013). Extended Kalman filter-based Elman networks for industrial time series prediction with GPU acceleration. Neurocomputing 118, 215-224.
18. Hamzaçebi, C., Akay, D., Kutay, F. (2009). Comparison of direct and iterative artificial neural network forecast approaches in multi-periodic time series forecasting. Expert Systems with Applications 36 (2), 3839-3844.
19. Demuth, H., Beale, M., Hagan, M. (2010). Neural network toolbox user's guide. The MathWorks, Natick, MA, USA.
20. Pelckmans, K., Suykens, J. A. K., Van Gestel, T., De Brabanter, J., Lukas, L., Hamers, B., et al. (2003). LS-SVMlab toolbox user's guide. Pattern Recognition Letters 24, 659-675.
21. Data Market (2014). http://datamarket.com
22. Pacific Exchange Rate Service (2014). http://fx.sauder.ubc.ca/data.html
23. Yahoo! Finance (2014). http://finance.yahoo.com