J Syst Sci Complex (2014) 27: 225–236

A MULTISCALE MODELING APPROACH INCORPORATING ARIMA AND ANNS FOR FINANCIAL MARKET VOLATILITY FORECASTING∗

XIAO Yi · XIAO Jin · LIU John · WANG Shouyang

DOI: 10.1007/s11424-014-3305-4
Received: 1 April 2012 / Revised: 20 April 2013
© The Editorial Office of JSSC & Springer-Verlag Berlin Heidelberg 2014

Abstract Financial market volatility forecasting is regarded as a challenging task because of irregularity, high fluctuation, and noise. In this study, a multiscale ensemble forecasting model is proposed. The original financial series are first decomposed into different scale components (i.e., approximation and details) using the maximum overlap discrete wavelet transform (MODWT). The approximation is predicted by a hybrid forecasting model that combines an autoregressive integrated moving average (ARIMA) model with a feedforward neural network (FNN). The ARIMA model is used to generate a linear forecast, and the FNN is then developed as a tool for nonlinear pattern recognition to correct the estimation error in the ARIMA forecast. The details are predicted by Elman neural networks. Three weekly exchange rate data sets are collected to establish and validate the forecasting model. Empirical results demonstrate consistently better performance of the proposed approach.

Keywords ARIMA model, financial market volatility forecasting, multiscale modeling approach, neural network, wavelet transform.

XIAO Yi
School of Information Management, Central China Normal University, Wuhan 430079, China. Email: [email protected].
XIAO Jin
Business School, Sichuan University, Chengdu 610064, China.
LIU John (Corresponding author)
Center for Transport Trade and Financial Studies, City University of Hong Kong, Hong Kong, China.
WANG Shouyang
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China.

∗ This research is supported by the Humanities and Social Sciences Youth Foundation of the Ministry of Education of PR of China under Grant No. 11YJC870028, the Self-determined Research Funds of CCNU from the Colleges’ Basic Research and Operation of MOE under Grant No. CCNU13F030, China Postdoctoral Science Foundation under Grant No. 2013M530753 and National Science Foundation of China under Grant No. 71390335.
This paper was recommended for publication by Guest Editor ZHANG Xun.

1 Introduction

The financial market is a complex and continuously evolving dynamic market with high volatility and noise. Due to its irregularity, financial market forecasting is regarded as a rather challenging task. Despite these difficulties, since the seminal work of Meese and Rogoff[1], the financial implications of accurately predicting financial market movements have motivated researchers and practitioners to deploy a variety of modeling methods based on macroeconomic fundamentals and purely statistical models[2, 3]. It has been shown that traditional econometric and time series techniques cannot reliably outperform the simplest random walk model[4], i.e., a model in which market prices wander in a purely random and unpredictable way. This has encouraged academic researchers and business practitioners to develop more accurate forecasting models. As a result, models using artificial intelligence techniques such as artificial neural networks (ANNs) have been recognized as more useful than conventional statistical forecasting models[5]. Many neural network techniques have been widely used in financial time series forecasting, and good results have been obtained[6–17]. Unfortunately, more and more researchers have realized that selecting only the single neural-network model with the best performance may lose potentially valuable information contained in other neural-network models with slightly weaker performance. Therefore, different learning strategies such as combined/ensemble learning and meta-learning have been presented[18]. Although there are many studies on ensemble forecasting, we find that there are two main problems. First, the complex behaviors inherent in original financial time series, such as nonlinearity, non-stationarity, high volatility, and noise, make forecasting difficult (Problem I). Second, it is difficult to decide which models suit the different components (Problem II).
To solve the above two problems, a three-stage multiscale ensemble modeling approach for forecasting financial time series is proposed. In the first stage, we adopt a multiresolution analysis of the level financial time series using the maximum overlap discrete wavelet transform (MODWT). The original financial time series is decomposed using MODWT at the chosen decomposition level into two parts: the approximation series and the detail series. In the second stage, the former is predicted by a hybrid forecasting model that combines autoregressive integrated moving average (ARIMA) with feedforward neural networks (FNN), and the details are predicted by Elman networks. In the final stage, the final forecast is obtained by combining the two forecasts. The rest of this study is organized as follows. Section 2 describes the building process of the multiscale ensemble forecasting model in detail. In Section 3, exchange rate time series between the US dollar and three other currencies are used to test the forecasting performance and to compare it with other methods in terms of several evaluation criteria. Finally, some concluding remarks are given in Section 4.

2 Multiscale Modeling Approach Incorporating ARIMA and ANNs

2.1 Wavelet Transform

The wavelet transform is a type of transformation that retains both time and frequency information of the signal. A partial discrete wavelet transform (DWT) of level $M$ can be implemented if the sample size $N$ is an integer multiple of $2^M$, and it yields an additive decomposition of a time series $X_t$ ($t = 0, 1, \cdots, N-1$). The decomposed result of the original time series $X_t$, a hierarchical structure of details (high frequencies) and approximations (low frequencies) at the set level, is defined as follows:

$$X_t = d_1 + d_2 + \cdots + d_M + a_M, \tag{1}$$

where $a_M$ (approximation series) represents the underlying smooth behavior of the original series at the coarse scale $M$, $d_M$ (detail series) represents the coarse-scale deviations from the smooth behavior, and $d_{M-1}, \cdots, d_1$ describe progressively finer scale deviations from the smooth behavior. An improved version of the traditional DWT, the maximal overlap discrete wavelet transform (MODWT)[19], carries out the same filtering steps as the standard DWT but does not subsample (decimate by 2). Therefore, the number of scaling and wavelet coefficients at every level of the transform is the same as the number of sample observations. Additionally, for the MODWT, the above-mentioned constraint on $N$ (that $N$ must be a multiple of $2^M$) no longer applies.
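The additive structure of Equation (1) can be illustrated with a minimal sketch, assuming NumPy is available. It uses the undecimated Haar "à trous" scheme with circular boundary handling as a simplified stand-in for a MODWT multiresolution analysis; it is not the db5 MODWT used in this paper. The function name `haar_a_trous` and the toy series are hypothetical.

```python
import numpy as np

def haar_a_trous(x, levels):
    """Undecimated Haar decomposition: x = d_1 + ... + d_M + a_M.

    Simplified stand-in for a MODWT multiresolution analysis (assumption);
    circular boundary handling via np.roll.
    """
    approx = np.asarray(x, dtype=float).copy()
    details = []
    for j in range(1, levels + 1):
        shift = 2 ** (j - 1)                      # filter taps spread apart at level j
        smoother = 0.5 * (approx + np.roll(approx, shift))
        details.append(approx - smoother)         # detail d_j = a_{j-1} - a_j
        approx = smoother                         # approximation a_j
    return details, approx

# Toy series whose length (1000) is not a multiple of 2**4.
rng = np.random.default_rng(0)
series = 1.3 + np.cumsum(rng.normal(0.0, 0.01, size=1000))

details, approx = haar_a_trous(series, levels=4)
reconstructed = np.sum(details, axis=0) + approx  # right-hand side of Eq. (1)
print(np.allclose(reconstructed, series))
```

Because nothing is subsampled, every detail series and the approximation have the same length as the input, and the length does not need to be a multiple of $2^M$.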

2.2 WT-Based Time Series Reconstructing

The multiscale decomposition of the MODWT enables us to use a different approach to reconstruct time series. The principle of this approach is to start with the smooth approximation $a_M$ and reconstruct the lost details near the jumps that have been smoothed out, by using the information conveyed in the detail series. Suppose that the original time series $X_t$ can be constructed additively by

$$X_t = S_t + u_t, \quad t = 0, 1, \cdots, N-1, \tag{2}$$

where $S_t$ is an unknown deterministic trend component, $u_t$ is a zero-mean stochastic process, and $N$ is the total number of observations. The trend component of the time series can be separated from the stochastic component by the wavelet multiscaling approach. The reason is that the trend can be associated with smooth, slowly varying dynamics on a large scale (low frequency), while the small-scale (high-frequency) parts of the time series may still be purely stochastic. Using the level-$M$ partial multiresolution analysis, the deterministic trend component is captured mainly by the approximation series of the $M$th level, whereas the fine-scale deviations from the smooth trend are captured by the detail series at levels 1 to $M$.

2.3 Forecasting of Trend Component of Time Series

Several studies find evidence of long-range dependence in financial markets and their volatility[20]. The original financial time series is composed of various frequency components, so it is very difficult to obtain the trend component, which is crucial for trend analysis and investment decisions. To address this problem, we adopt an analysis based on wavelet multiscaling theory, the MODWT, to separate the long-term trend from short-term volatility. An ARIMA model assumes a linear correlation structure among the data and therefore cannot capture nonlinear patterns. The approximation of real-world problems by linear models is not always satisfactory because real-world systems are often nonlinear[21]. It is also difficult to know completely the characteristics of the trend component in a financial time series. ANN models have been suggested as an alternative for time series forecasting because of their flexible nonlinear modeling capability. Therefore, a hybrid model integrating ARIMA with FNN is a reasonable way to improve the forecasting accuracy in this study.

2.4 The Hybrid Forecasting Method Integrating ARIMA with FNN

Practical time series are rarely purely linear or purely nonlinear. ARIMA is one of the linear models that can capture the linear characteristics of a time series, while FNN based on kernel estimation theory is one of the general function approximators with strong capability of modeling nonlinearity. Therefore, in this study we propose a hybrid model integrating ARIMA and FNN in an adaptive manner for economic time series forecasting. It may be reasonable to suppose that a time series is composed of a linear autocorrelation structure and a nonlinear component:

$$y_t = L_t + N_t, \tag{3}$$

where $y_t$ is the actual value, and $L_t$ and $N_t$ denote the linear component and the nonlinear component, respectively. According to Equation (3), the two components have to be estimated from the data to obtain the final forecasting result. In the first phase, an ARIMA model is used to extract the linear component of the time series. By comparing the actual value $y_t$ of the time series with the forecast value $\hat{L}_t$ of the linear component, we obtain a series of residuals $e_t$, defined as

$$e_t = y_t - \hat{L}_t. \tag{4}$$

A linear model is not sufficient if there are still linear correlation structures left in the residuals. In the second phase, an FNN model is used to model the nonlinear structure remaining in the residual series. The residual series generated in the first phase is used as the input of the FNN, and the trained FNN is then used to generate a series of forecasts of the nonlinear component of the time series. With $m$ input nodes, the FNN model for the residuals is described as

$$e_t = \varphi(e_{t-1}, e_{t-2}, \cdots, e_{t-m}, v) + \xi_t, \tag{5}$$

where $v$ is the parameter vector, $\varphi$ is a function determined by the FNN structure and connection weights, and $\xi_t$ is the random error. To obtain the synergetic forecasting results, we only need to integrate the forecasts of the linear and nonlinear components of the time series. Thus, the final forecast $\hat{y}_t$ can be calculated as follows:

$$\hat{y}_t = \hat{L}_t + \hat{N}_t, \tag{6}$$

where $\hat{L}_t$ is the forecast of the linear component and $\hat{N}_t$ is the forecast of the nonlinear component.
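The two-phase procedure of Equations (3)–(6) can be sketched as follows, assuming NumPy. Both models are deliberately simplified stand-ins and not the models fitted in this paper: a least-squares AR(p) replaces the ARIMA step, and a one-hidden-layer tanh network with random, fixed hidden weights and a least-squares output layer replaces the trained FNN. All function names, orders, and the toy series are hypothetical, and the comparison at the end is in-sample only.

```python
import numpy as np

def lagged(series, m):
    """Rows [s[t-1], ..., s[t-m]] and targets s[t] for t = m, ..., len-1."""
    n = len(series)
    X = np.column_stack([series[m - k:n - k] for k in range(1, m + 1)])
    return X, series[m:]

def fit_linear_ar(y, p):
    """Least-squares AR(p) with intercept (stand-in for the ARIMA step)."""
    X, target = lagged(y, p)
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return A @ coef, target                     # fitted L_hat and aligned actuals

def fit_random_fnn(e, m, hidden=8, seed=0):
    """Tanh network on m lagged residuals, as in Eq. (5).

    Simplification: hidden weights are random and fixed; only the output
    layer is fitted (by least squares) instead of full network training.
    """
    rng = np.random.default_rng(seed)
    X, target = lagged(e, m)
    X = X / (np.std(X) + 1e-12)                 # rescale inputs for tanh
    W = rng.normal(size=(m, hidden))
    b = rng.normal(size=hidden)
    H = np.column_stack([np.ones(len(X)), np.tanh(X @ W + b)])
    beta, *_ = np.linalg.lstsq(H, target, rcond=None)
    return H @ beta, target                     # fitted N_hat and aligned residuals

# Toy series: slow oscillation plus a random walk.
rng = np.random.default_rng(1)
t = np.arange(400)
y = 1.3 + 0.05 * np.sin(t / 15.0) + np.cumsum(rng.normal(0.0, 0.005, 400))

p, m = 2, 3
L_hat, y_aligned = fit_linear_ar(y, p)          # phase 1: linear component
e = y_aligned - L_hat                           # Eq. (4): residuals
N_hat, _ = fit_random_fnn(e, m)                 # phase 2: nonlinear component
y_hat = L_hat[m:] + N_hat                       # Eq. (6): combined forecast

mse_linear = np.mean((y_aligned[m:] - L_hat[m:]) ** 2)
mse_hybrid = np.mean((y_aligned[m:] - y_hat) ** 2)
print(mse_linear, mse_hybrid)
```

The first `m` linear fits are dropped when combining because the residual model needs `m` lagged residuals before it can produce its first value.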

2.5 Forecasting of Stochastic Component of Time Series

Using the level-$M$ partial multiresolution analysis, the stochastic component is captured by the detail series at levels 1 to $M$. The stochastic component is composed of the high-frequency parts of the financial time series. Since the higher-frequency oscillations occur in the detail parts, an Elman network seems more appropriate for predicting those parts.
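For readers unfamiliar with the Elman architecture, the sketch below shows only the forward pass of a single-hidden-layer Elman network, assuming NumPy. The hidden state is copied into context units and fed back at the next step, which is what distinguishes it from the FNN of Section 2.4. The weights here are arbitrary illustrative values, the names are hypothetical, and no training is shown; the paper does not report its network configuration.

```python
import numpy as np

def elman_forward(inputs, W_in, W_ctx, b_h, w_out, b_out):
    """Forward pass of a single-hidden-layer Elman network over a sequence.

    inputs: array of shape (T, m); returns T scalar outputs.
    """
    context = np.zeros(W_ctx.shape[0])          # context units start at zero
    outputs = []
    for x_t in inputs:
        hidden = np.tanh(W_in @ x_t + W_ctx @ context + b_h)
        outputs.append(w_out @ hidden + b_out)
        context = hidden                        # hidden state becomes next context
    return np.array(outputs)

# Tiny deterministic example: one input, one hidden unit, same input twice.
W_in = np.array([[1.0]])
W_ctx = np.array([[0.5]])
b_h = np.array([0.0])
w_out = np.array([1.0])
b_out = 0.0

same_input_twice = np.array([[1.0], [1.0]])
out = elman_forward(same_input_twice, W_in, W_ctx, b_h, w_out, b_out)
print(out)
```

The two outputs differ even though the inputs are identical, because the second step also sees the context left by the first.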

3 Empirical Study

To save space, the modeling and forecasting procedure of the proposed multiscale ensemble model is described only for the EUR/USD time series. The forecasts and forecasting performances for GBP/USD and USD/CAD are reported at the end of the paper.

3.1 Data Preparation

The foreign exchange rate data used in this paper are weekly observations obtained from the Wind database (http://www.wind.com.cn). To ensure that the application is sufficiently robust, they consist of the US dollar exchange rate against each of the currencies (EUR, GBP, and CAD) studied in this paper. The entire data set covers the period from 1992 to 2011 (1024 observations). The weekly data from 1992 to 2006 are used as the training set for modeling, and the remaining data from 2007 to 2011 as the test set for model verification and comparison. Note that the initial data set is split into three parts for modeling the FNN of the hybrid model.

Figure 1 4-level decomposition of EUR/USD by MODWT (panel labels in the source: a4, d4, d3, d2, d1, and GBPUSD; horizontal axis: observations, 0 to 1000)
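The three-part split mentioned above can be checked with a short sketch using only the Python standard library. The boundary dates and counts are those reported in Sections 3.4.1 and 3.4.2; the assumption that observations are spaced exactly seven days apart with no gaps is ours, as are the variable names.

```python
from datetime import date, timedelta

# Assumption: one observation every 7 days from the first to the last date.
start, end = date(1992, 5, 22), date(2011, 12, 30)
weeks = [start + timedelta(weeks=i) for i in range((end - start).days // 7 + 1)]

def between(first, last):
    """Observation dates falling in the closed interval [first, last]."""
    return [d for d in weeks if first <= d <= last]

arima_fit = between(date(1992, 5, 22), date(2001, 12, 28))   # ARIMA fitting
fnn_train = between(date(2002, 1, 4), date(2006, 12, 29))    # FNN training
test_set = between(date(2007, 1, 5), date(2011, 12, 30))     # test set

print(len(weeks), len(arima_fit), len(fnn_train), len(test_set))
```

Under this assumption the three parts account for all 1024 observations, with 502, 261, and 261 weeks respectively.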

3.2 Wavelet Transform

For the wavelet analysis, the model specification and parameters are determined by trial and error. The weekly exchange rate data of EUR/USD, GBP/USD, and USD/CAD are decomposed at level $M = 4$ using the db5 wavelet filter, which provides a relatively smooth trend of the data with relatively few ripples. Comparing it with other wavelet filters, we find that the db5 filter is more appropriate for depicting periodicities in the three exchange rate series. The 4-level decomposition of EUR/USD by MODWT is illustrated in Figure 1.

3.3 Performance Measures

To assess the ensemble prediction model, the forecasts are compared with the true realizations. The MAE (mean absolute error), MAPE (mean absolute percentage error), and RMSE (root mean squared error) are used as performance measures. From the business point of view, profits or returns are more important than conventional fit measurements. In exchange rate forecasting, improved decisions often depend on correctly forecasting directions or turning points between actual and predicted values ($T_i$ and $\hat{T}_i$). The ability to forecast movement direction or turning points can be measured by a statistic of directional change (DC), which can be expressed in percentage as

$$DC = \frac{1}{N-1} \sum_{i=1}^{N-1} \left[ (T_{i+1} - T_i)(\hat{T}_{i+1} - T_i) \geq 0 \right] \times 100\%, \tag{7}$$

where $T_i$ is the actual value at time $i$, $\hat{T}_{i+1}$ is the prediction at time $i+1$, and $(T_{i+1} - T_i)(\hat{T}_{i+1} - T_i) \geq 0$ is a logical expression that takes the value 1 when it holds and 0 otherwise. However, the real aim of forecasting is to obtain profits based on prediction results. Here the return rate is introduced as an important evaluation indicator, calculated according to a simple principle that ignores friction costs:

$$MR = \frac{P}{N} (AR - IR) \times 100\%, \tag{8}$$

where $MR$ is the $P$-period excess return rate relative to the tested exchange rate, $AR$ is the return rate obtained over the entire period of the testing set, $IR$ is the return rate of the tested exchange rate over the entire period of the testing set, and $N$ is the number of testing periods. For convenience of computation, we assume that one lot can be bought or sold. It is worth noting that the computation of $MR$ is based on a simple trading strategy that ignores transaction costs, as follows: if $(\hat{T}_{i+1} - T_i) > 0$, then “buy”, else “sell”.
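The error measures, the DC statistic of Equation (7), and the trading rule can be sketched as below, assuming NumPy. The function names and the four-point toy series are hypothetical; MAPE is returned in percent, and $MR$ is left out because its inputs ($AR$, $IR$, $P$) are not specified in enough detail here to compute it.

```python
import numpy as np

def mae(actual, pred):
    return float(np.mean(np.abs(actual - pred)))

def mape(actual, pred):
    """Mean absolute percentage error, in percent."""
    return float(np.mean(np.abs((actual - pred) / actual)) * 100.0)

def rmse(actual, pred):
    return float(np.sqrt(np.mean((actual - pred) ** 2)))

def directional_change(actual, pred):
    """Eq. (7): share of steps where the predicted and actual moves agree in sign."""
    actual_move = actual[1:] - actual[:-1]        # T_{i+1} - T_i
    predicted_move = pred[1:] - actual[:-1]       # T_hat_{i+1} - T_i
    return float(np.mean(actual_move * predicted_move >= 0) * 100.0)

def trading_signals(actual, pred):
    """Buy when the next forecast exceeds the current actual value, else sell."""
    return ["buy" if move > 0 else "sell" for move in pred[1:] - actual[:-1]]

# Toy actual values T and forecasts T_hat (illustrative only).
T = np.array([1.00, 1.10, 1.05, 1.00])
T_hat = np.array([1.00, 1.08, 1.12, 1.02])

print(mae(T, T_hat), mape(T, T_hat), rmse(T, T_hat))
print(directional_change(T, T_hat), trading_signals(T, T_hat))
```

In the toy data the forecast gets the direction right on the first and third steps and wrong on the second, so DC is two out of three.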

3.4 Forecasting Trend Series

3.4.1 ARIMA Modeling

An ARIMA model is used to model the linear component of the trend series. We found that the best-fitting model was ARIMA(1,1,1), selected by minimizing the AIC (Akaike Information Criterion) and BIC (Bayesian Information Criterion). The ARIMA model fitted on the data from 22 May 1992 to 28 December 2001 (502 observations) was then used to forecast the weekly exchange rates from 4 January 2002 to 30 December 2011. As shown in Figure 2(a), the forecasts of ARIMA(1,1,1) have some biases compared with the actual values. These biases are mainly due to the limitations of the linear modeling of the ARIMA model. This suggests that an ARIMA model alone is not well suited to identifying and exploring the nonlinear pattern of exchange rate time series.

Figure 2 The forecasting results of 4-level trend series of EUR/USD (three panels, (a)–(c), each plotting actual and forecast values against time)
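The idea of choosing a model order by minimizing AIC and BIC can be illustrated with a reduced sketch, assuming NumPy. It compares only AR(p) models fitted by least squares to the first-differenced series (so d = 1 and no moving-average terms), and it uses the Gaussian forms $n\ln(RSS/n) + 2k$ and $n\ln(RSS/n) + k\ln n$, which omit an additive constant. This is not the ARIMA(p,d,q) search carried out in the paper; the function name and toy data are hypothetical.

```python
import numpy as np

def ar_information_criteria(z, max_p):
    """AIC and BIC for AR(1)..AR(max_p) fitted by least squares on a common sample."""
    z = np.asarray(z, dtype=float)
    target = z[max_p:]                            # same targets for every order
    n = len(target)
    table = {}
    for p in range(1, max_p + 1):
        lags = [z[max_p - k:len(z) - k] for k in range(1, p + 1)]
        X = np.column_stack([np.ones(n)] + lags)
        coef, *_ = np.linalg.lstsq(X, target, rcond=None)
        rss = float(np.sum((target - X @ coef) ** 2))
        k = p + 1                                 # AR coefficients plus intercept
        aic = n * np.log(rss / n) + 2 * k
        bic = n * np.log(rss / n) + k * np.log(n)
        table[p] = (aic, bic)
    return table

# Toy level series of 502 points; differencing once plays the role of d = 1.
rng = np.random.default_rng(2)
level = 1.3 + np.cumsum(rng.normal(0.0, 0.01, 502))

criteria = ar_information_criteria(np.diff(level), max_p=4)
best_by_aic = min(criteria, key=lambda p: criteria[p][0])
best_by_bic = min(criteria, key=lambda p: criteria[p][1])
print(best_by_aic, best_by_bic)
```

All candidate orders are scored on the same set of targets so that their criteria are comparable; BIC penalizes extra parameters more heavily than AIC once $\ln n > 2$.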

3.4.2 Feedforward Neural Network Modeling

In this section, the FNN is trained in Matlab on the residuals of the linear model. The FNN trained on the data from 4 January 2002 to 29 December 2006 (261 observations) was then used to forecast the weekly exchange rates from 5 January 2007 to 30 December 2011. The forecasting results of the residuals for the test set are shown in Figure 2(b). The results indicate that the FNN is able to capture the nonlinear pattern of the exchange rate time series and to provide good predictions of the weekly fluctuation of exchange rates.

3.4.3 Integrating Forecast Based on ARIMA Model and FNN

After obtaining the forecasts of the linear component of the exchange rate time series by ARIMA(1,1,1) and the forecasts of the nonlinear component by FNN, we only need to integrate the two forecast series to get the final forecasting results. The forecast values of the hybrid model in the test set are shown in Figure 2(c). The verification results suggest that the predictions from the proposed hybrid model match the observations more closely. Table 1 lists the simulation performance of the ARIMA model, FNN, and hybrid model in the test set.

Table 1 Comparison of average performance of different models over 20 runs

Model          MAE*10^-3   MAPE*10^-3   RMSE*10^-3   DC(%)    MR(%)
FNN            7.947       5.479        10.638       57.49    95.36
ARIMA model    0.407       0.296        0.572        97.38    207.73
Hybrid model   0.267       0.197        0.386        98.84    211.94

It can be seen from Table 1 that: (a) According to all indices, the forecasting performance of the ARIMA model is better than that of FNN. The reason may be that linear components are the dominant factor in the trend series a4; (b) The prediction performance of the hybrid model is better than that of the ARIMA model and FNN, which demonstrates that the trend series includes both nonlinear and linear components. Therefore, using only ARIMA or only FNN is not suitable for forecasting it.

3.5 Forecasting Stochastic Series

The stochastic series is composed of the high-frequency parts of the exchange rate time series, which are predicted using Elman networks in this empirical case. The forecasts of the 4-level stochastic series d1, d2, d3, and d4 of EUR/USD in the test set are shown in Figure 3(a)–(d), respectively.

Figure 3 The forecasting results of 4-level stochastic series of EUR/USD (four panels, (a)–(d), each plotting actual and forecast values against time)

3.6 The Final Forecast

The final forecast is the sum of the forecasts of the trend series and the stochastic series. The final forecast of EUR/USD is shown in Figure 4. The maximal error is 2.84% and the mean absolute percentage error is only 0.58%, which supports the validity of the multiscale ensemble approach.
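Written out for the 4-level decomposition used here, the summation described above is the forecast counterpart of Equation (1). The hat notation for the component forecasts is introduced only for illustration; the paper does not state this formula explicitly.

```latex
\hat{X}_t = \hat{a}_{4,t} + \sum_{j=1}^{4} \hat{d}_{j,t}
```

Here $\hat{a}_{4,t}$ is the hybrid ARIMA and FNN forecast of the trend series from Section 3.4, and $\hat{d}_{j,t}$ is the Elman network forecast of the level-$j$ detail series from Section 3.5.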

Figure 4 The final forecast of EUR/USD by the multiscale ensemble model (actual and forecast values of EUR/USD plotted against time)

Table 2 Comparison of performances averaging each model over 20 runs

Currencies   Models           MAE*10^-3   MAPE*10^-3   RMSE*10^-3   DC(%)    MR(%)
EUR/USD      MLP[12]          2.2492      1.4691       2.5854       51.37    40.79
             RBFN[12]         1.5683      1.0894       1.9526       58.85    68.86
             Yu[18]           0.8638      0.6274       1.1769       83.87    524.74
             Proposed model   0.8362*     0.5739*      1.1484*      86.31*   536.44*
GBP/USD      MLP[12]          2.4648      1.4483       3.3479       52.46    2.71
             RBFN[12]         2.0629      1.1934       2.7696       55.69    11.38
             Yu[18]           0.7407      0.4896       1.0513       84.86    469.96
             Proposed model   0.7294*     0.4361*      0.9736*      88.39*   507.39*
USD/CAD      MLP[12]          1.4862      1.3875       2.0587       52.27    45.95
             RBFN[12]         1.3374      1.2847       1.8371       58.78    55.21
             Yu[18]           1.1873      1.0907       1.5375       70.05    414.16
             Proposed model   1.1375*     1.0739*      1.4953*      70.87*   428.48*

MLP: multi-layer perceptron; RBFN: radial basis function network; * denotes the optimal result.

3.7 Forecasting Performance

To verify the performance of the proposed multiscale ensemble model, it is compared with other methods presented in the literature. The simulation results in Table 2 suggest that the proposed wavelet-based multiscale ensemble model outperforms the other methods on all five measures for EUR/USD, GBP/USD, and USD/CAD. Moreover, the adopted trading strategy, which uses one-step-ahead forecasts from the multiscale ensemble model, outperforms the buy-and-hold strategy in the three cases where trends are detected.

4 Conclusions

In this study we propose a model that aims to provide more accurate predictions of financial market volatility. To discriminate the different volatility characteristics hidden in the original financial time series, the series is first decomposed into different scale components (i.e., approximation and details) using MODWT. The approximation is predicted by a hybrid forecasting model that combines ARIMA with FNN. The ARIMA model is used to generate a linear forecast, and the FNN is then developed as a tool for nonlinear pattern recognition to correct the estimation error in the ARIMA forecast. The details, on the other hand, are predicted by Elman networks. Experimental results on three exchange rates and five performance measures indicate that the proposed model is the best predictor among the compared models: it reduces the prediction errors and performs better than the competing models. Therefore, the proposed model may also be applied to other financial forecasting problems because of its capability of extracting key features hidden in financial market dynamics.

References

[1] Meese R A and Rogoff K S, Empirical exchange rate models of the seventies: Do they fit out of sample? Journal of International Economics, 1983, 14(1–2): 3–24.
[2] Marcos D B, Maximo C, and Gabriel P Q, Short-run forecasting of the euro-dollar exchange rate with economic fundamentals, Journal of International Money and Finance, 2012, 31(2): 377–396.
[3] Pasquale D C, Lucio S, and Giulia S, The predictive information content of external imbalances for exchange rate returns: How much is it worth? Review of Economics and Statistics, 2012, 94(1): 100–115.
[4] Kilian L and Taylor M P, Why is it so difficult to beat random walk forecast of exchange rates, Journal of International Economics, 2003, 60(1): 85–107.
[5] Yu L, Lai K K, and Wang S Y, Multistage RBF neural network ensemble learning for exchange rates forecasting, Neurocomputing, 2008, 71(16–18): 3295–3302.
[6] Yu L, Wang S Y, Lai K K, and Huang W W, Developing and assessing an intelligent forex rolling forecasting and trading decision support system for online e-service, International Journal of Intelligent Systems, 2007, 22(5): 475–499.
[7] Yu L, Chen H H, Wang S Y, and Lai K K, Evolving least squares support vector machines for stock market trend mining, IEEE Transactions on Evolutionary Computation, 2009, 13(1): 87–102.
[8] Wang Y Q, Wang S Y, and Lai K K, Measuring financial risk with generalized asymmetric least squares regression, Applied Soft Computing, 2011, 11(8): 5793–5800.
[9] Huang W, Lai K K, Nakamori Y, Wang S Y, and Yu L, Neural networks in finance and economics forecasting, International Journal of Information Technology and Decision Making, 2007, 6(1): 113–140.
[10] Yu L, Wang S Y, and Lai K K, A neural-network-based nonlinear metamodeling approach to financial time series forecasting, Applied Soft Computing, 2009, 9(2): 563–574.
[11] Yu L, Wang S Y, Lai K K, and Wen F H, A multiscale neural network learning paradigm for financial crisis forecasting, Neurocomputing, 2010, 73(4–6): 716–725.
[12] Dhamija A K and Bhalla V K, Exchange rate forecasting: Comparison of various architectures of neural networks, Neural Computing & Applications, 2011, 20(3): 355–363.
[13] Yu L, Wang S Y, and Lai K K, Credit risk assessment with a multistage neural network ensemble learning approach, Expert Systems with Applications, 2008, 34(2): 1434–1444.
[14] Yu L, Wang S Y, and Lai K K, Foreign-Exchange-Rate Forecasting with Artificial Neural Networks, Springer, New York, 2007.
[15] Xiao Y, Xiao J, and Wang S Y, A hybrid forecasting model for non-stationary time series: An application to container throughput prediction, International Journal of Knowledge and Systems Science, 2012, 3(2): 67–81.
[16] Xiao Y, Xiao J, and Wang S Y, A hybrid model for time series forecasting, Human Systems Management, 2012, 31(2): 133–143.
[17] Xiao Y, Xiao J, Lu F B, and Wang S Y, Ensemble ANNs-PSO-GA approach for day-ahead stock e-exchange prices forecasting, International Journal of Computational Intelligence Systems, 2013, 6(1): 96–114.
[18] Yu L, Wang S Y, and Lai K K, A novel nonlinear ensemble forecasting model incorporating GLAR and ANN for foreign exchange rates, Computers & Operations Research, 2005, 32(10): 2523–2541.
[19] Percival D B and Walden A T, Wavelet Methods for Time Series Analysis, Cambridge University Press, Cambridge, 2000.
[20] Baillie R T and Bollerslev T, Cointegration, fractional cointegration, and exchange rate dynamics, Journal of Finance, 1994, 49(2): 737–745.
[21] Zhang G, Patuwo B E, and Hu M Y, Forecasting with artificial neural networks: The state of the art, International Journal of Forecasting, 1998, 14: 35–62.