Journal of King Saud University – Computer and Information Sciences (2015) xxx, xxx–xxx


Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach

Ajit Kumar Rout a, P.K. Dash b,*, Rajashree Dash b, Ranjeeta Bisoi b

a G.M.R. Institute of Technology, Rajam, Andhra Pradesh, India
b S.O.A. University, Bhubaneswar, India

Received 14 March 2015; revised 24 May 2015; accepted 3 June 2015

KEYWORDS Low complexity FLANN models; Recurrent computationally efficient FLANN; Differential Evolution; Hybrid Moderate Random Search PSO

Abstract: The paper presents a low complexity recurrent Functional Link Artificial Neural Network (FLANN) for predicting financial time series data, such as stock market indices, over horizons ranging from 1 day ahead to 1 month ahead. Although different types of basis functions have been used in low complexity neural networks for stock market prediction, a comparative study is needed to choose the optimal combination of these functions for a reasonably accurate forecast. Further, several evolutionary learning methods, such as Particle Swarm Optimization (PSO), a modified version of its new variant (HMRPSO), and Differential Evolution (DE), are adopted here to find the optimal weights of the recurrent computationally efficient functional link neural network (RCEFLANN), which uses a combination of linear and hyperbolic tangent basis functions. The performance of the recurrent computationally efficient FLANN model is compared with that of low complexity neural networks using Trigonometric, Chebyshev, Laguerre, Legendre, and hyperbolic tangent basis functions in predicting stock prices from the Bombay Stock Exchange and Standard & Poor's 500 data sets using the different evolutionary methods. The results clearly reveal that the recurrent FLANN model trained with DE outperforms all other similarly trained FLANN models.
© 2015 The Authors. Production and hosting by Elsevier B.V. on behalf of King Saud University. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

* Corresponding author. E-mail address: [email protected] (P.K. Dash). Peer review under responsibility of King Saud University.


Financial time series data are more complicated than other statistical data due to long term trends, cyclical variations, seasonal variations and irregular movements. Predicting such highly fluctuating and irregular data is usually subject to large errors. Developing more realistic models that predict financial time series data and extract meaningful statistics from them more effectively and accurately is therefore of great research interest


in financial data mining. The traditional statistical models used for financial forecasting were simple, but suffered from several shortcomings due to the nonlinearity of the data. Hence researchers have developed more efficient and accurate soft computing methods such as the Artificial Neural Network (ANN), Fuzzy Information Systems (FIS), Support Vector Machine (SVM), and Rough Set theory for financial forecasting. Various ANN based methods like the Multi Layer Perceptron (MLP) network, Radial Basis Function Neural Network (RBFNN), Wavelet Neural Network (WNN), Local Linear Wavelet Neural Network (LLWNN), Recurrent Neural Network (RNN) and Functional Link Artificial Neural Network (FLANN) are extensively used for stock market prediction due to their inherent capability to identify the complex nonlinear relationships present in time series data and to approximate any nonlinear function to a high degree of accuracy. The use of ANNs to predict the behavior and tendencies of stocks has demonstrated itself to be a viable alternative to existing conventional techniques (Andrade de Oliveira and Nobre, 2011; Naeini et al., 2010; Song et al., 2007; Lee and Chen, 2007; Ma et al., 2010). A system of time series data analysis based on wavelet preprocessing and neural network clustering has been proposed in Kozarzewski (2010) for predicting future values; it has been tested as a tool for supporting stock market investment decisions and shows good prediction accuracy. MLP neural networks are mostly used by researchers for their inherent capability to approximate any nonlinear function to a high degree of accuracy (Lin and Feng, 2010; Tahersima et al., 2011). However, these models suffer from slow convergence, local minima and overfitting, and have a high computational cost, needing a large number of iterations for training due to the presence of hidden layers.

To overcome these limitations, a different kind of ANN, the Functional Link ANN proposed by Pao (1989), having a single layer architecture with no hidden layers, has been developed. The mathematical expression and computational calculation of a FLANN structure are similar to those of an MLP, but it possesses a higher rate of convergence and a lower computational load than an MLP structure (Majhi et al., 2005; Chakravarty and Dash, 2009). A wide variety of FLANNs with functional expansion using orthogonal trigonometric functions (Dehuri et al., 2012; Mili and Hamdi, 2012; Patra et al., 2009), Chebyshev polynomials (Mishra et al., 2009; Jiang et al., 2012; Li et al., 2012), Laguerre polynomials (Chandra et al., 2009) and Legendre orthogonal polynomials (Nanda et al., 2011; George and Panda, 2012; Rodriguez, 2009; Das and Satapathy, 2011; Patra and Bornand, 2010) have been discussed in the literature. The well known Back Propagation algorithm is commonly used to update the weights of a FLANN. In Yogi et al. (2010), a novel method using PSO for training a trigonometric FLANN has been discussed for the equalization of digital communication channels. In this paper, the detailed architecture and mathematical modeling of various polynomial and trigonometric FLANNs are described along with a new computationally efficient and robust FLANN and its recurrent version. It is well known that recurrent neural networks (RNNs) usually provide a smaller architecture than most non-recursive neural networks like the MLP, RBFNN, etc.
Also, their feedback properties make them dynamic and more efficient for accurately modeling nonlinear systems, which is imperative for nonlinear prediction and time series forecasting. Many Autoregressive Moving Average (ARMA) processes have been accurately modeled by RNNs for nonlinear dynamic system identification. One of the familiar approaches to training RNNs is Real-Time Recurrent Learning (RTRL) (Ampolucci et al., 1999), which has problems of stability and slow convergence. In nonlinear time series forecasting problems it gets trapped in local minima and cannot guarantee finding the global minimum. On the other hand, evolutionary learning techniques such as Differential Evolution, particle swarm optimization, genetic algorithms, bacteria foraging, etc. have been applied successfully to time series forecasting. DE is found to be efficient among them and outperforms other evolutionary algorithms, since it is simpler to apply and involves less computation with fewer function parameters to be optimized. DE is chosen here because it is a simple but powerful global optimization method that converges faster than PSO. A comparative study between Differential Evolution (DE) and Particle Swarm Optimization (PSO) in the training and testing of feed-forward neural networks for the prediction of daily stock market prices has shown that DE provides a faster convergence speed and better accuracy than the PSO algorithm in the prediction of fluctuating time series (Abdual-Salam et al., 2010). A Differential Evolution based FLANN has also shown its superiority over a Back Propagation based trigonometric FLANN in Indian stock market prediction (Hatem and Mustafa, 2012). The convergence toward a good global solution is also faster, escaping from local minima even when multiple optimal solutions exist. Thus, in this paper various evolutionary learning methods like PSO, HMRPSO and DE for improving the performance of different types of FLANN models are discussed. By comparing the performance of the various FLANN models in predicting stock prices of the Bombay Stock Exchange and Standard & Poor's 500 data sets, an attempt is made to find the best FLANN among them.

The rest of the paper is organized as follows. In Sections 2 and 3, the detailed architecture of the various FLANNs and the evolutionary learning algorithms used for training are described. A simulation study demonstrating the prediction performance of the different FLANNs is carried out in Section 4. This section also provides comparative training and testing results of the different FLANNs using PSO, HMRPSO, and DE (Mohapatra et al., 2012; Qin et al., 2008; Wang et al., 2011) based learning for predicting financial time series data. Finally, conclusions are drawn in Section 5.

2. Architecture of low complexity neural network models

The FLANN originally proposed by Pao is a single layer, single neuron architecture having two components: a functional expansion component and a learning component. The functional block helps to introduce nonlinearity by expanding the input space to a higher dimensional space through a basis function, without using any hidden layers as in the MLP structure. The mathematical expression and computational calculation of a FLANN structure are the same as for an MLP, but it possesses a higher rate of convergence and a lower computational cost than an MLP structure. The wide application of FLANN models to nonlinear problems like channel equalization, nonlinear dynamic system identification, electric load forecasting, earthquake prediction, and financial forecasting has demonstrated their viability, robustness and ease of computation. The functional expansion block comprises either a trigonometric block or a polynomial


block. Further, the polynomial block can be expressed in terms of Chebyshev, Laguerre, or Legendre basis functions.

Trigonometric FLANN (TRFLANN) is a single layer neural network in which the original input pattern in a lower dimensional space is expanded to a higher dimensional space using orthogonal trigonometric functions (Chakravarty and Dash, 2009; Dehuri et al., 2012; Mili and Hamdi, 2012). With order m, any n dimensional input pattern X = [x1, x2, ..., xn]^T is expanded to a p dimensional pattern TX by the trigonometric functional expansion as TX = [1, TF1(x1), TF2(x1), ..., TFm(x1), TF1(x2), TF2(x2), ..., TFm(x2), ..., TF1(xn), TF2(xn), ..., TFm(xn)]^T, where p = m*n + 1. Each xi in the input pattern is expanded using trigonometric functions of order m as {1, sin(pi*xi), cos(pi*xi), sin(2pi*xi), cos(2pi*xi), ..., sin(m*pi*xi), cos(m*pi*xi)}.

Trigonometric functions:
$$TF_0(x) = 1,\; TF_1(x) = \cos \pi x,\; TF_2(x) = \sin \pi x,\; TF_3(x) = \cos 2\pi x,\; TF_4(x) = \sin 2\pi x,\; TF_5(x) = \cos 3\pi x,\; TF_6(x) = \sin 3\pi x \quad (1)$$

Chebyshev polynomials: The recursive formula to generate higher order Chebyshev polynomials is given by
$$Ch_{p+1}(x) = 2x\,Ch_p(x) - Ch_{p-1}(x) \quad (2)$$
so that
$$Ch_0(x) = 1,\; Ch_1(x) = x,\; Ch_2(x) = 2x^2 - 1,\; Ch_3(x) = 4x^3 - 3x,\; Ch_4(x) = 8x^4 - 8x^2 + 1,\; Ch_5(x) = 16x^5 - 20x^3 + 5x,\; Ch_6(x) = 32x^6 - 48x^4 + 18x^2 - 1$$

Laguerre polynomials: The recursive formula to generate higher order Laguerre polynomials is given by
$$La_{p+1}(x) = \frac{1}{p+1}\left[(2p + 1 - x)\,La_p(x) - p\,La_{p-1}(x)\right] \quad (3)$$
so that
$$La_0(x) = 1,\; La_1(x) = 1 - x,\; La_2(x) = 0.5x^2 - 2x + 1,\; La_3(x) = -x^3/6 + 3x^2/2 - 3x + 1,\; La_4(x) = x^4/24 - 2x^3/3 + 3x^2 - 4x + 1,\; La_5(x) = -x^5/120 + 5x^4/24 - 5x^3/3 + 5x^2 - 5x + 1$$

Legendre polynomials: The Legendre polynomials are denoted by Le_p(x), where p is the order and -1 < x < 1 is the argument of the polynomial. They constitute a set of orthogonal polynomials obtained as solutions of the differential equation
$$\frac{d}{dx}\left[(1 - x^2)\frac{dy}{dx}\right] + p(p + 1)y = 0$$
The zeroth and first order Legendre polynomials are Le_0(x) = 1 and Le_1(x) = x, respectively. The higher order polynomials are
$$Le_2(x) = \tfrac{3}{2}x^2 - \tfrac{1}{2},\; Le_3(x) = \tfrac{5}{2}x^3 - \tfrac{3}{2}x,\; Le_4(x) = \tfrac{35}{8}x^4 - \tfrac{15}{4}x^2 + \tfrac{3}{8},\; Le_5(x) = \tfrac{63}{8}x^5 - \tfrac{35}{4}x^3 + \tfrac{15}{8}x,\; Le_6(x) = \tfrac{231}{16}x^6 - \tfrac{315}{16}x^4 + \tfrac{105}{16}x^2 - \tfrac{5}{16} \quad (4)$$

The predicted sample x(t + k) can be represented as a weighted sum of nonlinear polynomial arrays, P_i(x_j). The inherent nonlinearities in the polynomials attempt to accommodate the nonlinear causal relation of the future sample with the samples prior to it. Using trigonometric expansion blocks and a sample index k, the following relation is obtained:
$$u(k)^{Trigono} = d_{(0)}(k) + \sum_{j=1}^{n}\sum_{i=1}^{m} d_{(i,j)}(k)\,TF_i(x_j(k)) \quad (5)$$
and this FLANN is named TRFLANN. Using Chebyshev polynomials, the CHFLANN output is obtained as
$$u(k)^{Chebyshev} = a_{(0)}(k) + \sum_{j=1}^{n}\sum_{i=1}^{m} a_{(i,j)}(k)\,Ch_i(x_j(k)) \quad (6)$$
Using Laguerre polynomials, the LAGFLANN output is found as
$$u(k)^{Laguerre} = b_{(0)}(k) + \sum_{j=1}^{n}\sum_{i=1}^{m} b_{(i,j)}(k)\,La_i(x_j(k)) \quad (7)$$


Using Legendre polynomials, Eq. (8) gives the LEGFLANN output
$$u(k)^{Legendre} = c_{(0)}(k) + \sum_{j=1}^{n}\sum_{i=1}^{m} c_{(i,j)}(k)\,Le_i(x_j(k)) \quad (8)$$

Figure 1. Computationally efficient FLANN models: (a) computationally efficient FLANN; (b) computationally efficient recurrent FLANN.

Finally, the output from each of the FLANN models is passed through an activation block to give the output as
$$y(k) = \rho S(u(k)^{Trigono}),\;\; \text{or}\;\; y(k) = \rho S(u(k)^{Chebyshev}),\;\; \text{or}\;\; y(k) = \rho S(u(k)^{Laguerre}),\;\; \text{or}\;\; y(k) = \rho S(u(k)^{Legendre}) \quad (9)$$
where $\rho$ controls the output magnitude and S is a nonlinear function given by
$$S(u(k)) = \frac{2}{1 + e^{-2u(k)}} - 1 \quad (10)$$
To obtain the optimal d, a, b and c values, an error minimization algorithm can be used.
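To make the expansion and output computation concrete, the following minimal Python sketch (not taken from the paper; the function names and example numbers are our own illustrative assumptions) implements a trigonometric functional expansion and the activation of Eqs. (9) and (10). Here each input contributes sin/cos pairs up to order m, so the expanded dimension differs slightly from p = m*n + 1 depending on how the sin/cos pairs are counted.

import numpy as np

def trig_expand(x, m):
    """Expand input vector x into [1, cos(pi x_i), sin(pi x_i), ..., cos(m pi x_i), sin(m pi x_i)]."""
    feats = [1.0]
    for xi in x:
        for order in range(1, m + 1):
            feats.append(np.cos(order * np.pi * xi))
            feats.append(np.sin(order * np.pi * xi))
    return np.array(feats)                    # length 2*m*n + 1 for this sin/cos pair version

def flann_output(x, weights, m, rho=1.0):
    """Weighted sum of the expanded pattern passed through the bipolar sigmoid of Eq. (10)."""
    u = np.dot(weights, trig_expand(x, m))
    s = 2.0 / (1.0 + np.exp(-2.0 * u)) - 1.0  # S(u) in Eq. (10), which equals tanh(u)
    return rho * s

# Example usage with arbitrary illustrative numbers
x = np.array([0.2, 0.5, 0.9])                 # n = 3 inputs, already scaled to [0, 1]
m = 2
w = np.random.uniform(-0.5, 0.5, size=2 * m * len(x) + 1)
print(flann_output(x, w, m))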

2.1. Computationally efficient FLANN (CEFLANN)

The computationally efficient FLANN is a single layer ANN that uses trigonometric basis functions for functional expansion. Unlike the earlier FLANNs, where each input of the input pattern is expanded through a set of nonlinear functions, here all the inputs of the input pattern pass through a small set of nonlinear functions to produce the expanded input pattern; the new FLANN comprises only a few functional blocks of nonlinear functions for the inputs and thereby results in a high dimensional input space for the neural network. This new FLANN architecture has a much lower computational requirement and possesses a high convergence speed. Fig. 1 depicts the single layer computationally efficient FLANN architecture, in which a cascaded FIR element and a functional expansion block are used. The output of this network is obtained as
$$y(k)^{cef} = \rho S(u(k)) = \rho S\{W_1(k)X(k) + W_2(k)\cdot FE(k)\} \quad (11)$$
where
$$X(k) = [x_1(k), x_2(k), \ldots, x_n(k)]^T,\qquad FE(k) = [FE_1(k), FE_2(k), \ldots, FE_p(k)]^T \quad (12)$$
and
$$FE_i = \tanh\!\left(a_{i0} + \sum_{j=1}^{n} a_{ij}x_j\right),\quad i = 1, 2, \ldots, p \quad (13)$$
where $W_1$ and $W_2$ are the weight vectors for the linear part and the functional expansion part, p is the total number of expansions with n inputs, and $S'$ is the derivative of the activation function S.
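As a concrete illustration of Eqs. (11)-(13), the following minimal Python sketch (an assumption-laden illustration, not the authors' implementation) computes the CEFLANN output from a linear part and a small number of tanh expansion blocks.

import numpy as np

def ceflann_forward(x, W1, W2, A, rho=1.0):
    """CEFLANN output of Eq. (11).
    x  : input vector of length n
    W1 : weights of the linear part, length n
    W2 : weights of the expansion part, length p
    A  : (p, n+1) matrix, row i = [a_i0, a_i1, ..., a_in] of Eq. (13)
    """
    # Functional expansion block: FE_i = tanh(a_i0 + sum_j a_ij * x_j)   (Eq. 13)
    fe = np.tanh(A[:, 0] + A[:, 1:] @ x)
    # u(k) = W1.X(k) + W2.FE(k), followed by the bipolar sigmoid of Eq. (10)
    u = W1 @ x + W2 @ fe
    return rho * np.tanh(u)

# Illustrative dimensions: n = 8 inputs and p = 2 tanh expansion blocks, as used later in the paper
n, p = 8, 2
x = np.random.rand(n)
W1, W2 = np.random.uniform(-0.5, 0.5, n), np.random.uniform(-0.5, 0.5, p)
A = np.random.uniform(-0.5, 0.5, (p, n + 1))
print(ceflann_forward(x, W1, W2, A))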

2.2. Adaptation of weights of the FLANN models

The weights and associated parameters of the four FLANN models and the CEFLANN model are updated at the end of each experiment by computing the error between the desired output and the estimated output. The error at the kth time step of the Lth experiment is expressed as $e(k) = y_{des}(k) - y(k)$, where $y_{des}$ is the desired FLANN output. The cost function is
$$E(k) = \tfrac{1}{2}e^2(k) \quad (14)$$
Using the gradient descent algorithm, the weights $W^{Tri}, W^{Che}, W^{Lag}, W^{Lege}$ of the Trigonometric, Chebyshev, Laguerre, and Legendre FLANN models, respectively, are updated as
$$W^{FL}(k) = W^{FL}(k-1) + \eta\rho S'(u(k))\,e(k)\,X^{FL}(k) + \vartheta\left(W^{FL}(k-1) - W^{FL}(k-2)\right) \quad (15)$$
where $\eta$ and $\vartheta$ are the learning rate and momentum terms, $W^{FL}$ is the weight vector of the corresponding FLANN model, and $X^{FL}$ stands for the functional expansion terms of the input. With trigonometric expansions, $X^{FL} = [1, \phi_1, \phi_2, \phi_3, \ldots, \phi_{mn-2}, \phi_{mn-1}, \phi_{mn}]^T = [1, x_1, \cos \pi x_1, \sin \pi x_1, \ldots, x_n, \cos \pi x_n, \sin \pi x_n]^T$, the total number of terms being equal to mn + 1. For the first four FLANN models, the weight vector $W^{FL}$ is given by
$$W^{Tri} = [d_{(0)}, d_{(1,1)}, \ldots, d_{(m,n)}]^T,\quad W^{Che} = [a_{(0)}, a_{(1,1)}, \ldots, a_{(m,n)}]^T,\quad W^{Lag} = [b_{(0)}, b_{(1,1)}, \ldots, b_{(m,n)}]^T,\quad W^{Lege} = [c_{(0)}, c_{(1,1)}, \ldots, c_{(m,n)}]^T$$
For the CEFLANN model,
$$W_1(k) = W_1(k-1) + \rho S'(u(k))\,e(k)\,X^T(k) \quad (16)$$
$$W_2(k) = W_2(k-1) + \rho S'(u(k))\,e(k)\,FE^T(k) \quad (17)$$
and the coefficients of the ith functional block are obtained as
$$a_{i0}(k) = a_{i0}(k-1) + \rho W_{2,i}\,S'(u(k))\,e(k)\,\mathrm{sech}^2\!\left(a_{i0} + \sum_{j=1}^{n} a_{ij}x_j\right) \quad (18)$$
$$a_{ij}(k) = a_{ij}(k-1) + \rho W_{2,i}\,S'(u(k))\,e(k)\,\mathrm{sech}^2\!\left(a_{i0} + \sum_{j=1}^{n} a_{ij}x_j\right)x_j \quad (19)$$

2.3. Computationally efficient recurrent FLANN (RCEFLANN)

In conventional FLANN models, since no lagged output terms are used, there is no correlation between the training samples. However, when lagged output samples are used as inputs, a correlation exists between the training samples and hence an incremental learning procedure is required for the adjustment of the weights. A recurrent version of this FLANN is shown in Fig. 1, where delayed output samples are fed back to the input to provide a more accurate forecasting scenario. As shown in Fig. 1, the input vector contains a total of n + m + p inputs, comprising n inputs from the delayed outputs, m inputs from the stock closing price indices, and p inputs from the functional expansion. Thus the input vector is obtained as
$$V(k) = [R^T(k), X^T(k), FE^T(k)] \quad (20)$$
which is written in an expanded form as
$$V(k) = [y(k-1), y(k-2), \ldots, y(k-n), x_1, x_2, \ldots, x_m, \phi_1, \phi_2, \ldots, \phi_p] \quad (21)$$
and for a low complexity expansion $\phi_i$ takes the form
$$\phi_i = \tanh(w_{i0} + w_{i1}x_1 + w_{i2}x_2 + \ldots + w_{im}x_m),\quad i = 1, 2, \ldots, p \quad (22)$$
Thus the output is obtained as
$$y(k) = cS(u^{cef}(k)) = cS\left(W_1(k)R(k) + W_2(k)X(k) + W_3(k)\cdot FE(k)\right) \quad (23)$$
where
$$W_1 = [a_1\, a_2\, a_3 \ldots a_n]^T,\quad W_2 = [b_1\, b_2\, b_3 \ldots b_m]^T,\quad W_3 = [c_1\, c_2\, c_3 \ldots c_p]^T$$
The most common gradient based algorithms used for online training of recurrent neural networks are the BP algorithm and real-time recurrent learning (RTRL) (Ampolucci et al., 1999). The RTRL algorithm proceeds in the following steps. Using the same cost function for the recurrent FLANN model, for a particular weight w(k) the change of weight is obtained as
$$\Delta w(k) = -\eta\frac{\partial E(k)}{\partial w(k)} = \eta\, e(k)\,S'(u^{cef}(k))\,\frac{\partial y(k)}{\partial w(k)} \quad (24)$$
where $\eta$ is the learning rate. The partial derivatives of Eq. (24) are obtained in a modified form for the RTRL algorithm as
$$\frac{\partial y(i)}{\partial a_k} = y(i-k) + \sum_{j=1}^{n} a_j\frac{\partial y(i-j)}{\partial a_k},\quad k = 1, 2, \ldots, n$$
$$\frac{\partial y(i)}{\partial b_k} = x_k + \sum_{j=1}^{n} a_j\frac{\partial y(i-j)}{\partial b_k},\quad k = 1, 2, \ldots, m$$
$$\frac{\partial y(i)}{\partial c_k} = \phi_k + \sum_{j=1}^{n} a_j\frac{\partial y(i-j)}{\partial c_k},\quad k = 1, 2, \ldots, p$$
$$\frac{\partial y(i)}{\partial w_{k0}} = c_k\,\mathrm{sech}^2(\phi_k) + \sum_{j=1}^{n} a_j\frac{\partial y(i-j)}{\partial w_{k0}},\quad k = 1, 2, \ldots, p$$
$$\frac{\partial y(i)}{\partial w_{kr}} = c_k x_r\,\mathrm{sech}^2(\phi_k) + \sum_{j=1}^{n} a_j\frac{\partial y(i-j)}{\partial w_{kr}},\quad k = 1, 2, \ldots, p;\; r = 1, 2, \ldots, m \quad (25)$$
The weight adjustment formulas are, therefore, obtained as (with k running up to n, m, and p for the weights of the recurrent, input, and functional expansion parts, respectively):
$$a_k(i) = a_k(i-1) + \eta\, e(k)\,S'(u^{cef}(k))\,\frac{\partial y(i)}{\partial a_k}$$
$$b_k(i) = b_k(i-1) + \eta\, e(k)\,S'(u^{cef}(k))\,\frac{\partial y(i)}{\partial b_k}$$
$$c_k(i) = c_k(i-1) + \eta\, e(k)\,S'(u^{cef}(k))\,\frac{\partial y(i)}{\partial c_k}$$
$$w_{kr}(i) = w_{kr}(i-1) + \eta\, e(k)\,S'(u^{cef}(k))\,\frac{\partial y(i)}{\partial w_{kr}} \quad (26)$$
where $u_j(k) = \sum_{j=1}^{p+q+r} w_j(k)v_j$ and $\delta_{i,j}$ is a Kronecker delta equal to 1 when j = k and 0 otherwise. Further, if the learning rate is kept small the weights do not change rapidly, and hence
$$\frac{\partial y(i-1)}{\partial a_k} \approx \frac{\partial y(i-1)}{\partial a_k(i-1)} \quad (27)$$
Denoting the gradient
$$p_k(i) = \frac{\partial y(i)}{\partial w_k}$$
the gradient values in successive iterations are generated assuming
$$p_k(0) = 0 \quad (28)$$
In both these algorithms, gradient descent based on first order derivatives is used to update the synaptic weights of the network. However, both these approaches exhibit slow


convergence rates because of the small learning rates required, and most often they become trapped in local minima. To avoid the common drawbacks of the back propagation algorithm and to increase accuracy, different methods have been proposed, including the additional momentum method, self-adaptive learning rate adjustment, and various search algorithms like GA, PSO and DE in the training step of the neural network to optimize its parameters, such as the network weights and the number of hidden units in the hidden layer. In Section 3, the evolutionary learning methods Particle Swarm Optimization, Differential Evolution and Hybrid Moderate Random Search PSO are described for training the various FLANN models. Evolutionary algorithms act as excellent global optimizers for real parameter problems.

3. Evolutionary learning methods

The problem described can be formulated as an objective function for error minimization using the equation below:
$$\mathrm{ERROR} = \sqrt{\frac{1}{N}\sum_{k=1}^{N}\left(d(k) - y(k)\right)^2} \quad (29)$$
where the prediction is made k days ahead and y(k) is represented as a function of the weights (the variables to be optimized) and the prior values of the time series (constants):
$$y^{Trigono}(k) = f(d_{(0)}, d_{(1,1)}, \ldots, d_{(m,n)};\; x_{(t)}, \ldots, x_{(k-n+1)}) \quad (30)$$
$$y^{Chebyshev}(k) = f(a_{(0)}, a_{(1,1)}, \ldots, a_{(m,n)};\; x_{(t)}, \ldots, x_{(k-n+1)}) \quad (31)$$
$$y^{Laguerre}(k) = f(b_{(0)}, b_{(1,1)}, \ldots, b_{(m,n)};\; x_{(t)}, \ldots, x_{(k-n+1)}) \quad (32)$$
$$y^{Legendre}(k) = f(c_{(0)}, c_{(1,1)}, \ldots, c_{(m,n)};\; x_{(t)}, \ldots, x_{(k-n+1)}) \quad (33)$$
The dimension of the problem to be solved by the evolutionary algorithm is therefore (m·n + 1). The objective function for optimization is chosen as
$$J = \frac{1}{1 + \mathrm{ERROR}^2} \quad (34)$$

3.1. DE based learning

Differential Evolution (DE) is a population-based stochastic function optimizer which uses a rather greedy and less stochastic approach to problem solving in comparison with classical evolutionary algorithms such as genetic algorithms, evolutionary programming, and PSO. DE combines simple arithmetical operators with the classical operators of recombination, mutation, and selection to evolve from a randomly generated starting population to a final solution. Here, a self-adaptive strategy for the control parameters of DE, F and Cr, is adopted to improve the robustness of the DE algorithm. The pseudo code for the DE implementation is given below.

3.1.1. Pseudo code for DE implementation

Input: population size Np, number of variables to be optimized (dimension D), initial scaling and mutation parameters F, F1, F2, and Cr; G = total number of generations; X = target vector; strategy candidate pool: "DE/rand/2"; population $P^g = \{X^g_{(1)}, \ldots, X^g_{(Np)}\}$, with $X^g_{(i)} = \{x^g_{(i,1)}, x^g_{(i,2)}, \ldots, x^g_{(i,j)}, \ldots, x^g_{(i,D)}\}$ uniformly distributed.

1. While the stopping criterion is not satisfied do
2. for i = 1 to Np
$$F_1 = F^L_1 + (F^U_1 - F^L_1)\cdot \mathrm{rand}(0,1),\qquad F_2 = F^L_2 + (F^U_2 - F^L_2)\cdot \mathrm{rand}(0,1) \quad (35)$$
Generate the mutant vector $V^g_{(i)} = \{v^g_{(i,1)}, v^g_{(i,2)}, \ldots, v^g_{(i,j)}, \ldots, v^g_{(i,D)}\}$:
$$V^g_{(i)} = X^g_{(l)} + F_1\cdot\left(X^g_{(d)} - X^g_{(c)}\right) + F_2\cdot\left(X^g_{(f)} - X^g_{(g)}\right) \quad (36)$$
where the indices $l, d, c, f, g \in [1, Np]$ are mutually distinct. The crossover rate is adapted as
$$Cr = Cr_L + (Cr_U - Cr_L)\cdot \mathrm{rand}_3 \quad (37)$$
where rand3 is a random number in [0, 1].
end for
3. for i = 1 to Np; jrand = rndint(1, D); for j = 1 to D
$$u^g_{(i,j)} = \begin{cases} v^g_{(i,j)} & \text{if } \mathrm{rand}_{(i,j)} \le Cr \text{ or } j = j_{rand} \\ x^g_{(i,j)} & \text{otherwise} \end{cases} \quad (38)$$
end for; end for. This generates the trial vector from the target vector:
$$U^g_{(i)} = \{u^g_{(i,1)}, u^g_{(i,2)}, \ldots, u^g_{(i,j)}, \ldots, u^g_{(i,D)}\} \quad (39)$$
4. for i = 1 to Np: evaluate the objective function of the trial vector $U^g_{(i)}$, obtained from Eq. (34) as $f(U^g_{(i)}) = J$.
$$\text{If } f(U^g_{(i)}) < f(X^g_{(i)})\ \text{then } X^{g+1}_{(i)} = U^g_{(i)},\ f(X^{g+1}_{(i)}) = f(U^g_{(i)});\qquad \text{If } f(U^g_{(i)}) < f(X^g_{(best)})\ \text{then } X^g_{(best)} = U^g_{(i)},\ f(X^g_{(best)}) = f(U^g_{(i)}) \quad (40)$$
end if; end for; g = g + 1
5. end While
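The following minimal Python sketch (an illustrative reconstruction under the assumptions noted in the comments, not the authors' code) shows how such a DE loop with two difference vectors and self-adapted F and Cr can be wired to a forecasting objective of the form of Eq. (34).

import numpy as np

rng = np.random.default_rng(0)

def fitness(weights, errors_fn):
    """Objective of Eq. (34): J = 1 / (1 + ERROR^2); larger is better here."""
    err = errors_fn(weights)
    return 1.0 / (1.0 + err ** 2)

def de_optimize(errors_fn, dim, np_size=20, gens=100,
                f1_bounds=(0.4, 0.95), f2_bounds=(0.3, 0.8), cr_bounds=(0.6, 0.9)):
    # Random uniform initial population in [-1, 1]; the bounds are an assumption.
    pop = rng.uniform(-1.0, 1.0, (np_size, dim))
    fit = np.array([fitness(ind, errors_fn) for ind in pop])
    for _ in range(gens):
        for i in range(np_size):
            # Self-adapted scale factors and crossover rate (Eqs. 35 and 37)
            f1, f2, cr = rng.uniform(*f1_bounds), rng.uniform(*f2_bounds), rng.uniform(*cr_bounds)
            # Two-difference-vector mutation in the spirit of DE/rand/2 (Eq. 36)
            l, d, c, f, g = rng.choice([k for k in range(np_size) if k != i], 5, replace=False)
            mutant = pop[l] + f1 * (pop[d] - pop[c]) + f2 * (pop[f] - pop[g])
            # Binomial crossover (Eq. 38)
            jrand = rng.integers(dim)
            mask = rng.random(dim) <= cr
            mask[jrand] = True
            trial = np.where(mask, mutant, pop[i])
            # Greedy selection on the fitness J (Eq. 40; J is maximized in this sketch)
            jtrial = fitness(trial, errors_fn)
            if jtrial > fit[i]:
                pop[i], fit[i] = trial, jtrial
    return pop[np.argmax(fit)], fit.max()

# Toy usage: pretend the "network error" is the RMSE of a linear model on random data.
X = rng.random((50, 5)); y = X @ np.array([0.3, -0.2, 0.5, 0.1, 0.7])
rmse = lambda w: np.sqrt(np.mean((X @ w - y) ** 2))
best_w, best_j = de_optimize(rmse, dim=5)
print(best_j)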

3.2. Adaptive HMRPSO based learning

In the conventional PSO algorithm the particles are initialized randomly and afterward updated according to:


Figure 2. RMSE errors of the Legendre and computationally efficient FLANNs for the S&P500 and BSE stock market indices using different evolutionary algorithms: (a) S&P500 data with LEFLANN; (b) BSE data with LEFLANN; (c) S&P500 data with CEFLANN; (d) BSE data with CEFLANN.

$$x^{k+1}_{ij} = x^{k}_{ij} + V^{k+1}_{ij},\qquad V^{k+1}_{ij} = \omega V^{k}_{ij} + c_1 r_1\left(pbest^{k}_{ij} - x^{k}_{ij}\right) + c_2 r_2\left(gbest^{k}_{j} - x^{k}_{ij}\right) \quad (41)$$

where w, c1, c2 are inertia, cognitive and social acceleration constants respectively, r1 and r2 are random numbers within [0,1]. pbest is the best solution of the particle achieved so far and indicates the tendency of the individual particles to replicate their corresponding past behaviors that have been successful. gbest is the global best solution so far, which indicates the tendency of the particles to follow the success of others (Lu, 2011; Sanjeevi et al., 2011; Ampolucci et al., 1999). HMRPSO is a new PSO technique using the moderate random search strategy to enhance the ability of particles to explore the solution spaces more effectively and a new mutation strategy to find the global optimum (Gao and Xu, 2011). This new method improves the convergence rate for the particles by focusing their search in valuable search space regions. The mutation operator is a combination of both global and local mutation. The global mutation enhances the global search ability of MRPSO, when the difference of a particle’s position between two successive iterations on each dimension has

decreased to a given threshold value. The momentum for this particle to roam through the solution space is maintained by resetting the particle at a random position. The local mutation operator enables the particles to search precisely in the local area of the pbest found so far, which compensates for the weaker local search ability that results from the use of a global mutation operator. Unlike PSO, it does not require any velocity update equation. Using the MRS strategy, the position update equation is as follows:
$$x^{k+1}_{ij} = p_j + \alpha c\left(mbest_j - x^{k}_{ij}\right) \quad (42)$$
where
$$mbest = \frac{1}{np}\sum_{i=1}^{np} pbest_i \quad (43)$$
$$p_j = r1\cdot pbest_{ij} + (1 - r1)\cdot gbest_j,\qquad c = (r2 - r3)/r4 \quad (44)$$

Please cite this article in press as: Rout, A.K. et al., Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach. Journal of King Saud University – Computer and Information Sciences (2015), http://dx.doi.org/10.1016/j.jksuci.2015.06.002

8

A.K. Rout et al.

(a) RMSE error comparison of S&P500 data for different FLANNs using DE

(b) MAPE error comparison of BSE data using different FLANNs using DE

(a) RMSE error comparison of BSE data using different FLANNs and DE Figure 3

(b) MAPE error comparison of BSE data using different FLANNs and DE

RMSE and MAPE Errors for different FLANNS and DE.

The mbest term gives step size for the movement of particles and also makes the contribution of all pBest to the evolution of particles. The inertia factor a controls the convergence rate of the method. If the absolute value of the difference between xkþ1 and xkij is smaller than a threshold value Tj, then the ij current particle can use a mutation operator on dimension j. A suitable value is chosen for threshold, such that neither the algorithm spent more time in mutation operator and less time to conduct MRS in solution space, nor it gets a little chance to do mutation such that the particles will need a relatively long time to escape from the local optima. Again a gradient descent method is used to control the value of Tj using the following formula:

If ðFj ¼ kÞ then Fj ¼ 0 and Tj ¼ Tj =m

ð45Þ

Parameter m controls the decline rate of Tj and k controls the mutation frequency of particles. Fj denotes the number of particles that has used the mutation operator on dimension j. Fj is updated by the following equation: Fj ðkÞ ¼ Fj ðk  1Þ þ

np X bkij

ð46Þ

i¼1

where ( bkij ¼

0

if ðabsðxkij  xk1 ij Þ > Tj Þ

1

if ðabsðxkij  xk1 ij Þ 6 Tj Þ

ð47Þ

Please cite this article in press as: Rout, A.K. et al., Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach. Journal of King Saud University – Computer and Information Sciences (2015), http://dx.doi.org/10.1016/j.jksuci.2015.06.002

Low Complexity Recurrent Neural Network

Figure 4

9

Comparison of actual and predicted S&P500 stock indices with different FLANN models.

Fj is initially set to zero and Tj is initially set to the maximum value taken for the domain of particles position. To enhance the global search ability of the HMRPSO, a global mutation operator is applied using the following formula:

mutation operator is compensated using a local mutation operator as follows:

mutateðxij Þ ¼ ðr5  rangeÞ

where t is the index of the pbest that gives best achievement among all other pbest positions and mutðrÞ returns a value within (0, 1), which is drawn from a standard Gaussian distribution. finding the variance of the population fitness as

ð48Þ

r5 is a random variable within [1,1] and range is the maximum value set for the domain of particles position. Again the weaker local searching ability caused using the global

pbest0tj ¼ pbesttj ð1  mutðrÞÞ

ð49Þ

Please cite this article in press as: Rout, A.K. et al., Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach. Journal of King Saud University – Computer and Information Sciences (2015), http://dx.doi.org/10.1016/j.jksuci.2015.06.002

10

A.K. Rout et al.

Figure 5

Comparison of actual and predicted BSE stock indices with different FLANN models.

vffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi u Np  uX Ji  Javg2 n¼t F0 i¼1

ð50Þ

where Javg is the average fitness of the population of particles in a given generation. Fi is the fitness of the ith particle in the population, M is the total number of particles. F0 ¼ fmaxðjFi  FavgjÞg; i ¼ 1; 2; 3 . . . . . . M

ð51Þ

In the equation given above Fn is the normalizing factor, which is used to limit n. If n is large, the population will be in a random searching mode, while for small n or n ¼ 0, the solution tends toward a premature convergence and will give

the local best position of the particles. To circumvent this phenomenon and to obtain gbest solution, the factor a in (42) controls the convergence rate of the HMRPSO, which is similar with inertia weight used in the PSO. On the one hand, a larger a value enables particles to have a stronger exploration ability but a less exploitation ability, while on the other hand, smaller a allows particles a more precise exploitation ability. The factor a in Eq. (42) is updated using a fuzzy rule base and fuzzy membership values of the change in standard deviation Dn in the following way: 2

l1 l arg e ðnÞ ¼ ejnj ; l1 small ðnÞ ¼ 1  ejnj

2

ð52Þ

Please cite this article in press as: Rout, A.K. et al., Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach. Journal of King Saud University – Computer and Information Sciences (2015), http://dx.doi.org/10.1016/j.jksuci.2015.06.002

Low Complexity Recurrent Neural Network

11

One day ahead prediction of S&P500 Data using DE/curr to best based RBF 1450 2.2

1400

actual predicted

2

1350 1300

1.8

1250

1.6

1200

1.4

1150

1.2

actual predicted

1100 1050

4 x 10 1 day ahead prediction of BSE Data using DE/curr to best based RBF

1

0

50

100

150

200

250

0.8

0

50

100

150

Days

Figure 6

200 Days

250

300

350

400

Comparison of actual and predicted S&P500 and BSE stock indices with RBF.

The change in the standard deviation is
$$\Delta\xi(k) = \xi(k) - \xi(k-1) \quad (53)$$
The fuzzy rule base for arriving at a weight change is expressed as
$$R1:\ \text{If } |\Delta\xi| \text{ is Large Then } \Delta\alpha = \mathrm{rand}_1(\,)\,\Delta\xi;\qquad R2:\ \text{If } |\Delta\xi| \text{ is Small Then } \Delta\alpha = \mathrm{rand}_2(\,)\,\Delta\xi \quad (54)$$
where the membership functions for Large and Small are given by
$$\mu_L(\Delta\xi) = |\Delta\xi|,\qquad \mu_S(\Delta\xi) = 1 - |\Delta\xi| \quad (55)$$
where $\mathrm{rand}_1(\,)$ and $\mathrm{rand}_2(\,)$ are random numbers between 0 and 1, and
$$0 \le |\Delta\xi| \le 1 \quad (56)$$
Combining the two rules,
$$\Delta\alpha = |\Delta\xi|\cdot \mathrm{rand}_1(\,)\,\Delta\xi + \mathrm{rand}_2(\,)\,\Delta\xi\cdot(1 - |\Delta\xi|) \quad (57)$$
Thus the value of the new weight is obtained as
$$\alpha(k) = \alpha(k-1) + \Delta\alpha \quad (58)$$

3.2.1. Steps of adaptive HMRPSO based learning

Step 1: Expand the input pattern using the functional expansion block.
Step 2: Initialize the position of each individual within a given maximum value.
Step 3: Find the fitness function value of each particle, i.e. the error obtained by applying the weights specified in each particle to the expanded input and applying the nonlinear tanh() function at the output unit.
Step 4: Initialize the pbest, gbest and pbest_t positions of the particles.
Step 5: Update each particle's position using the updated value of α from Eq. (58).
Step 6: For each dimension j
Step 7: if (abs(x_ij^k - x_ij^{k-1}) < T_j)
Step 8: if (F_j ≤ k/2)
Step 9: Apply the global mutation operator on x_ij using Eq. (48)
Step 10: else
Step 11: Apply the local mutation operator on the best pbest position, i.e. pbest_tj
Step 12: end if
Step 13: end if
Step 14: Update F_j and T_j using Eqs. (47) and (46)
Step 15: end for
Step 16: Update the pbest and gbest positions by comparing the fitness values of the new mutant particles obtained using global mutation with the fitness values of the particles of the last iteration.
Step 17: Update the position of pbest_t, and accordingly the gbest position, by comparing the fitness value of pbest_t obtained using local mutation with its previous fitness value.
Step 18: Repeat Steps 5-17 for each particle in the population until some termination condition is reached, such as a predefined number of iterations or the error satisfying the default precision value.
Step 19: Fix the weights equal to the gbest position and use the network for testing.
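A compact Python sketch of the fuzzy adaptation of the inertia factor α in Eqs. (50)-(58) is given below (a hedged illustration with assumed inputs; the paper does not publish code).

import numpy as np

rng = np.random.default_rng(2)

def population_spread(J):
    """Normalized spread of the population fitness, Eqs. (50)-(51)."""
    J = np.asarray(J, dtype=float)
    dev = np.abs(J - J.mean())
    f0 = dev.max() if dev.max() > 0 else 1.0     # guard against a zero normalizer (our assumption)
    return np.sqrt(np.sum((dev / f0) ** 2))

def update_alpha(alpha, xi, xi_prev):
    """Fuzzy update of the inertia factor, Eqs. (53)-(58)."""
    dxi = np.clip(xi - xi_prev, -1.0, 1.0)       # Eq. (53), kept inside [-1, 1] per Eq. (56)
    mu_large, mu_small = abs(dxi), 1.0 - abs(dxi)                 # Eq. (55)
    d_alpha = mu_large * rng.random() * dxi + mu_small * rng.random() * dxi   # Eq. (57)
    return alpha + d_alpha                       # Eq. (58)

# Illustrative use across two generations of fitness values
alpha, xi_prev = 0.3, 0.0
for J in ([0.41, 0.40, 0.44, 0.39], [0.45, 0.46, 0.44, 0.47]):
    xi = population_spread(J)
    alpha = update_alpha(alpha, xi, xi_prev)
    xi_prev = xi
print(alpha)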

4. Experimental result analysis

In this study, sample data sets of the Bombay Stock Exchange (BSE) and Standard & Poor's 500 (S&P500), collected from Google Finance and comprising the daily closing prices, have been taken to test the performance of the different FLANN models trained using the evolutionary methods. The total number of samples for BSE is 1200, from 1st January 2004 to 31st December 2008, and for S&P500 it is 653, from 4th January 2010 to 12th December 2012. Both data sets are divided into training and testing sets. For the BSE data set, the training set consists of the initial 800 patterns and the following 400 data points are set aside for testing, and for the S&P500 data set the training set


Figure 7. Comparison of actual and predicted S&P500 and BSE stock indices with the recurrent CEFLANN model (3–15 days ahead).

consists of the initial 400 patterns, leaving the rest for testing. To improve the performance, initially all the inputs are scaled between 0 and 1 using the min–max normalization
$$y = \frac{x - x_{min}}{x_{max} - x_{min}} \quad (59)$$
where y is the normalized value, x is the value to be normalized, and x_min and x_max are the minimum and maximum values of the series to be normalized.
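As a small illustration of Eq. (59), the following Python snippet scales a price series into [0, 1] (the helper name and the use of NumPy are our own choices, not the paper's):

import numpy as np

def min_max_scale(series):
    """Eq. (59): scale a price series into [0, 1]."""
    s = np.asarray(series, dtype=float)
    return (s - s.min()) / (s.max() - s.min())

closing_prices = [1100.5, 1123.0, 1098.7, 1150.2]   # illustrative values
print(min_max_scale(closing_prices))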


The Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) have been used to compare the performance of the different evolutionary FLANNs in predicting the closing price of the BSE and S&P500 indices 1 day, 3 days, 5 days, 7 days, 10 days and 15 days in advance with the different learning algorithms. The RMSE and MAPE are defined as
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(y_k - \hat{y}_k\right)^2} \quad (60)$$
$$\mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n}\left|\frac{y_k - \hat{y}_k}{y_k}\right| \times 100 \quad (61)$$
where $y_k$ is the actual closing price on the kth day, $\hat{y}_k$ is the predicted closing price on the kth day, and n is the number of test data. In order to avoid the adverse effect of very small stock indices over a long period of time, the AMAPE defined in Eq. (62) is adopted and compared for all the learning models:
$$\mathrm{AMAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i - \hat{y}_i}{(1/N)\sum_{i=1}^{N} y_i}\right| \times 100 \quad (62)$$
Before using a single model or a combination of models to predict the future stock market indices, it is assumed that there is a true model or a combination of models for a given time series. In this paper, the variance of the forecast errors is used to measure this uncertainty of the neural network model. The smaller the variance, the less uncertain is the model, or the more accurate are the forecast results. The variance of the forecast errors over a certain span of time is needed, and this parameter is obtained as
$$\sigma^2_E = \frac{1}{ND}\sum_{i=1}^{ND}\left[\frac{|y_i - \hat{y}_i|}{(1/N)\sum_{i=1}^{N} y_i} - \frac{1}{ND}\sum_{i=1}^{ND}\frac{|y_i - \hat{y}_i|}{(1/N)\sum_{i=1}^{N} y_i}\right]^2 \quad (63)$$

Figure 8. Computed variances of the S&P 500 and BSE stock indices: (a) variances of the S&P 500 stock data from 23rd July 2010 to 6th May 2011; (b) variances of the BSE stock data from 7th March 2007 to 13th October 2008.

Further, technical indicators attempt to predict the future price value by looking at past patterns. A brief explanation of each indicator used is given here.

(i) Five day Moving Average (MA): The moving average (Eq. (64)) is used to emphasize the direction of a trend and smooth out price and volume fluctuations that can confuse interpretation:
$$MA(n) = \frac{\sum_{i=t-n}^{t} y_i}{n} \quad (64)$$
where MA(n) is the moving average over n days and $y_i$ is the closing price of the ith day.

(ii) Five day bias (BIAS): The difference between the closing value and the moving average line, which uses the tendency of the stock price to return to the average price to analyze the stock market (Eq. (65)):
$$BIAS(n) = \frac{y_i - MA(n)}{MA(n)} \quad (65)$$

(iii) Standard Deviation (SD):
$$SD(\sigma) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \mu\right)^2},\qquad \mu = \frac{1}{n}\sum_{i=1}^{n} y_i \quad (66)$$

The number of input layer nodes for the FLANN models has been set to 8, to represent the closing indices of the past 5 days and the technical indicators mentioned above, and the number of output nodes to 1, for the closing index of the 6th day. The 8 dimensional input patterns have been expanded to a pattern of dimension 33 using order 4 and the corresponding basis functions for TRFLANN, LAGFLANN, CHFLANN, and LEFLANN. For the CEFLANN, using two nonlinear tanh() functions for expansion, the input pattern has been expanded to a pattern of 10 dimensions. The same model is also used in the case of the recurrent CEFLANN model, with three lagged output terms fed into the input. Initially the performance of each FLANN model, having the same network structure and the same data sets, trained using PSO, HMRPSO, and DE with different mutation strategies, has been observed. The same population of size 20 with a maximum number of training iterations kmax = 100 has been used for all the evolutionary learning based methods. Each individual in the population has width 33, i.e. equal to the number of weights, for training the TRFLANN, CHFLANN, LAGFLANN, LEFLANN


Table 1 Performance comparison of different FLANN models with different evolutionary learning methods for 1-day ahead prediction. Type of FLANN

Evolutionary learning method

S&P500

BSE

RMSE test

MAPE test

RMSE test

MAPE test

Recurrent CEFLANN

PSO HMRPSO DE current to best

0.0411 0.0332 0.0308

1.3876 1.0895 1.0197

0.0259 0.0210 0.0152

3.1120 2.4883 1.8180

CEFLANN

PSO HMRPSO DE current to best

0.0441 0.0343 0.0328

1.5158 1.1362 1.0849

0.0282 0.0224 0.0184

3.2834 2.6905 2.2441

LEFLANN

PSO HMRPSO DE current to best

0.0479 0.0438 0.0340

1.7231 1.5243 1.1594

0.0301 0.0272 0.0258

3.3398 3.0531 2.8172

LAGFLANN

PSO HMRPSO DE current to best

0.0567 0.0527 0.0506

2.0437 1.8557 1.7729

0.0622 0.0576 0.0568

7.0308 6.1226 5.9825

CHFLANN

PSO HMRPSO DE current to best

0.0545 0.04200 0.0371

1.9178 1.4166 1.3099

0.0524 0.0425 0.0415

5.7065 4.3535 4.2276

TRFLANN

PSO HMRPSO DE current to best

0.1041 0.0883 0.0738

3.4783 3.0075 2.5679

0.0791 0.0699 0.0688

8.3051 7.2520 7.1153

Recurrent CEFLANN CEFLANN

RTRL Gradient Descent

0.0563 0.0513

2.579 2.456

0.0712 0.0451

5.862 4.685

Bold values are the outputs from the proposed method.

Table 2

Comparison of convergence speed of CEFLANN models with different evolutionary learning methods.

Type of FLANN

Evolutionary learning method

S&P500

BSE

Time elapsed in sec during training

Time elapsed in sec during training

Recurrent CEFLANN

PSO HMRPSO DE current to best

66.28 155.85 49.94

154.26 270.65 148.31

CEFLANN

PSO HMRPSO DE current to best

40.07 108.17 35.09

96.57 229.43 92.65

Bold values are the outputs from the proposed method.

Table 3

MAPE errors during testing of S&P500 data set using different evolutionary FLANN models.

Days ahead

TRFLANN

LAGFLANN

CHFLANN

LEFLANN

CEFLANN

RCEFLANN

RBFNN

WNN

1 3 5 7 10 15

2.8264 3.4751 3.0419 4.2072 5.3751 6.6261

2.1865 2.3116 2.5902 2.8144 3.0774 3.4447

1.7922 2.6464 3.8075 4.8223 6.1739 6.4565

1.4662 2.3718 3.1018 4.1813 5.2114 6.3137

1.1212 1.7768 2.1164 2.4299 2.6965 3.2343

0.9958 1.7672 2.1064 2.4093 2.6951 3.1429

1.8565 2.9070 3.1682 3.2899 4.1729 5.2024

1.9432 3.112 3.323 3.425 4.0867 4.9326

structure using the evolutionary methods, but the individual has width 28, i.e. equal to the total number of associated parameters and weights, when used for training the CEFLANN network. For PSO, the values of c1 and c2 are set at 1.9 and the inertia factor is linearly decreased from 0.45 to 0.35. For DE, the mutation scale has been fixed to $F^U_1 = 0.95$, $F^L_1 = 0.4$, $F^U_2 = 0.8$, $F^L_2 = 0.3$, and the crossover rate to $Cr_U = 0.9$, $Cr_L = 0.6$. For HMRPSO the inertia factor is linearly decreased from 0.4 to 0.1. The RMSE error has been taken as the fitness function for all the evolutionary learning algorithms.
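Since the RMSE of Eq. (60) serves as the fitness signal for all the evolutionary trainers, while the MAPE and AMAPE of Eqs. (61)-(62) are used for evaluation, a small Python sketch of these metrics is given below (the helper functions are our own, shown only for illustration).

import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Square Error, Eq. (60)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, Eq. (61)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def amape(y_true, y_pred):
    """Adjusted MAPE of Eq. (62): errors are scaled by the mean level of the series."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / np.mean(y_true))

# Illustrative check on a short test window
actual    = [1120.0, 1133.5, 1128.2, 1140.9]
predicted = [1118.3, 1130.1, 1131.0, 1138.4]
print(rmse(actual, predicted), mape(actual, predicted), amape(actual, predicted))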

Please cite this article in press as: Rout, A.K. et al., Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach. Journal of King Saud University – Computer and Information Sciences (2015), http://dx.doi.org/10.1016/j.jksuci.2015.06.002

Low Complexity Recurrent Neural Network Table 4

15

MAPE errors during testing of BSE data set using different evolutionary FLANN models.

Days ahead

TRFLANN

LAGFLANN

CHFLANN

LEFLANN

CEFLANN

RCEFLANN

RBFNN

WNN

1 3 5 7 10 15

7.1153 7.8761 8.3101 8.7211 9.8088 10.6614

5.9825 6.5576 6.9390 7.2503 7.8726 9.0252

4.2276 5.4677 5.4759 7.4708 7.6583 9.3138

3.8741 5.2244 5.5157 6.4177 8.3432 9.5982

2.2441 4.4055 4.8933 5.2592 6.3007 7.8549

1.8180 3.4747 4.6082 5.2263 6.2692 7.8364

2.8752 4.8619 5.3918 7.7466 7.8579 9.6480

2.924 4.976 5.281 7.612 7.986 9.610

Table 5 Stock market

S&P 500

Variances of the predicted S&P 500 stock indices on specific days. Dates (from 23rd July 2010 to 13th Aug. 2010)

Variances (from 23rd July 2010 to 13th Aug. 2010

Dates (from 29th Oct. 2010 to 19th Nov. 2010)

Variances (from 29th Oct. 2010 to 19th Nov. 2010)

Dates (from 9th Feb. 2011 to 3rd Mar. 2011)

Variances (from 9th Feb. 2011 to 3rd Mar. 2011)

23-07-2010

dvo(400:415) 1.0e03* 0.0399

29-10-2010

dvo(470:485) 1.0e03* 0.2379

09-02-2011

dvo(540:555) 1.0e03* 0.0245

26-07-2010 27-07-2010 28-07-2010 29-07-2010 30-07-2010 02-08-2010 03-08-2010 04-08-2010 05-08-2010 06-08-2010 09-08-2010 10-08-2010 11-08-2010 12-08-2010 13-08-2010

0.0430 0.1077 0.1302 0.1674 0.1168 0.0089 0.0113 0.0101 0.2518 0.4284 0.5153 0.3136 0.0365 0.0520 0.0738

01-11-2010 02-11-2010 03-11-2010 04-11-2010 05-11-2010 08-11-2010 09-11-2010 10-11-2010 11-11-2010 12-11-2010 15-11-2010 16-11-2010 17-11-2010 18-11-2010 19-11-2010

0.3299 0.2383 0.1290 0.0229 0.0321 0.0830 0.0904 0.2536 0.2301 0.1159 0.1184 0.1187 0.1054 0.0630 0.0663

10-02-2011 11-02-2011 14-02-2011 15-02-2011 16-02-2011 17-02-2011 18-02-2011 22-02-2011 23-02-2011 24-02-2011 25-02-2011 28-02-2011 01-03-2011 02-03-2011 03-03-2011

0.0293 0.0270 0.0372 0.1269 0.2620 0.3279 0.2261 0.0789 0.0947 0.0917 0.1228 0.1240 0.1104 0.0877 0.0557

Fig. 2 shows a comparison of the RMSE errors on the S&P500 and BSE data sets during training of the two better performing FLANN models, the LEFLANN and the CEFLANN, using PSO, HMRPSO (adaptive version) and the DE/current-to-best method. From these figures it is quite clear that, during the training period, the RMSE converges to the lowest value for the two FLANN models using the DE current-to-best variant in comparison with the other evolutionary methods such as PSO and HMRPSO. Comparing the performance of each FLANN using the various evolutionary learning methods (PSO, HMRPSO, and DE with different mutation strategies), it has been observed that each FLANN trained using the DE/current-to-best mutation provides better results than the other learning methods. Thus, the DE current-to-best variant is used to produce the final forecast of the stock indices and to reduce the prediction errors. The RMSE and MAPE values for the two stock indices, S&P500 and BSE, obtained during training with the different models and the DE current-to-best evolutionary approach are shown in Fig. 3. From this figure it is seen that both the CEFLANN and its recurrent version produce the least RMSE and MAPE errors during training in just 20 iterations. The RMSE and MAPE values for all the FLANN models using the different evolutionary training paradigms are shown in Table 1, from which it can be concluded that the recurrent CEFLANN produces the least errors when trained with the DE current-to-best technique. Again, to compare

the convergence speed of the evolutionary learning algorithms on the CEFLANN models, the time elapsed during training is also given in Table 2. From the table it is clear that the DE method has a faster convergence speed than the other two methods. Figs. 4 and 5 show the actual and predicted closing price values of the S&P500 and BSE data sets during testing with the different FLANN models, and from these figures it is quite clear that the CEFLANN and its recurrent adaptation produce accurate forecasts in comparison with all the other FLANN models such as the TRFLANN, LAGFLANN, CHFLANN, and LEGFLANN. The LAGFLANN and TRFLANN show the worst forecast results for the one day ahead stock closing price for both the S&P500 and BSE stocks. A comparison of the evolutionary approaches with the well known gradient descent algorithms is also shown in Table 1, clearly showing the superior forecasting performance of the former. Further, Tables 3 and 4 show the forecast results from one day ahead to 15 days ahead, in which the MAPE of the RCEFLANN varies over roughly 1–3.15%, compared with roughly 1.15–6% for the CEFLANN, LEGFLANN and LAGFLANN, respectively (Fig. 7). Tables 3 and 4 also provide a comparison of MAPE values with other neural network models such as the RBF neural network (Fig. 6) and the Wavelet Neural Network (WNN), when trained with the DE/current-to-best variant. From these tables it is clearly

Please cite this article in press as: Rout, A.K. et al., Forecasting financial time series using a low complexity recurrent neural network and evolutionary learning approach. Journal of King Saud University – Computer and Information Sciences (2015), http://dx.doi.org/10.1016/j.jksuci.2015.06.002

16 Table 6 Stock Market

BSE

A.K. Rout et al. Variances of the predicted BSE stock indices on specific days. Dates (from 07th Mar. 2007 to 29th Mar. 2007)

Variances (from 07th Mar. 2007 to 29th Mar. 2007)

07-03-2007 08-03-2007 09-03-2007 12-03-2007 13-03-2007 14-03-2007 15-03-2007 16-03-2007 19-03-2007 20-03-2007 21-03-2007 22-03-2007 23-03-2007 26-03-2007 28-03-2007 29-03-2007

dvo(800:815) 1.0e03* 0.0970 0.1210 0.1382 0.1827 0.1359 0.0343 0.1121 0.3349 0.2909 0.1897 0.1107 0.1031 0.0688 0.2121 0.1963 0.1901

Dates (from 19th Dec. 2007 to 11th Jan. 2008)

Variances (from 19th Dec. 2007 to 11th Jan. 2008)

19-12-07 20-12-07 24-12-07 26-12-07 27-12-07 28-12-07 31-12-07 01-01-08 02-01-08 03-01-08 04-01-08 07-01-08 08-01-08 09-01-08 10-01-08 11-01-08

dvo(1000:1015) 1.0e03* 0.8897 0.6140 0.0864 0.0073 0.0321 0.0269 0.0827 0.1456 0.1522 0.1464 0.0482 0.0436 0.0454 0.1863 0.4594 0.7512

apparent that the RCEFLANN is the simplest neural model, is quite robust, and provides superior forecasting performance when trained with the DE algorithm in comparison with well established neural networks like the RBF and the wavelet neural network. Having observed that the RCEFLANN performs best in predicting the future stock indices, it is important to calculate the day to day variances and the variances over a given time frame. Fig. 8 depicts the variances, clearly showing the accuracy of the forecasting model, since their magnitudes are quite small over a large time frame (more than one year). Further, the day to day variances shown in Tables 5 and 6 completely validate this argument.

5. Conclusion

The Functional Link Artificial Neural Network is a single layer ANN structure in which the hidden layers are eliminated by transforming the input pattern to a high dimensional space using a set of polynomial or trigonometric basis functions, giving rise to a low complexity neural model. Different variants of the FLANN can be modeled depending on the type of basis functions used in the functional expansion. In this paper, the detailed architecture and mathematical modeling of different FLANNs with polynomial and trigonometric basis functions have been described for predicting stock market returns. Further, to improve the performance of the different FLANNs, a number of evolutionary methods have been discussed for optimizing their parameters. Comparing the performance of each FLANN using the various evolutionary learning methods (PSO, HMRPSO, DE, etc.) with different mutation strategies, it has been observed that each FLANN trained using the DE/current-to-best mutation provides better results than the other learning methods. Again, the performance comparison of the various evolutionary FLANN models in predicting stock prices of the Bombay Stock Exchange and Standard & Poor's 500 data sets for 1 day, 7 days, 15 days and 30 days ahead shows that the performance of stock

Again, the comparison of the various evolutionary FLANN models in predicting the stock prices of the Bombay Stock Exchange and Standard & Poor's 500 data sets for 1 day, 7 days, 15 days and 30 days ahead shows that stock price prediction can be significantly enhanced using the CEFLANN and recurrent CEFLANN models trained with the DE/current-to-best method, in comparison with the well-known RBF and WNN models. The recurrent CEFLANN exhibits very low variances and can also be used to predict the volatility of stock indices and the trends needed for trading decisions.
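Finally, as a concrete illustration of the functional-expansion idea underlying all of the FLANN variants compared in this work, the short sketch below expands an input window with trigonometric basis functions and forms the output as a single weighted sum. The function names, the choice of harmonics and the sample lag window are illustrative assumptions, not the authors' implementation; the CEFLANN and RCEFLANN use the specific expansion and recurrent feedback defined earlier in the paper.

```python
import numpy as np

def trig_expansion(x, order=2):
    """Expand each input with sine/cosine harmonics (trigonometric functional expansion)."""
    feats = [x]
    for k in range(1, order + 1):
        feats.append(np.sin(k * np.pi * x))
        feats.append(np.cos(k * np.pi * x))
    return np.concatenate(feats)

def flann_predict(x, w, b=0.0, order=2):
    """Single-layer output: a weighted sum of the expanded features (no hidden layer)."""
    phi = trig_expansion(x, order)
    return float(np.dot(w, phi) + b)

# Hypothetical usage: a 5-lag window of normalised closing prices.
x = np.array([0.41, 0.43, 0.42, 0.45, 0.44])
w = np.zeros(x.size * (2 * 2 + 1))   # 25 weights; in practice learned, e.g. by DE
print(flann_predict(x, w))
```

In such a model the weight vector (together with any feedback coefficient in the recurrent variant) is the only quantity to be learned, which is what keeps the complexity low and makes evolutionary training with DE practical.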




