Eighth International Conference on Hybrid Intelligent Systems

Neural networks and exponential smoothing models for symbolic interval time series processing – applications in stock market

André Luis Santiago Maia and Francisco de A. T. De Carvalho
Universidade Federal de Pernambuco, Centro de Informática
Av. Prof. Luiz Freire, s/n - Cidade Universitária, CEP: 50740-540 - Recife - PE - Brasil
{alsm3, fatc}@cin.ufpe.br

978-0-7695-3326-1/08 $25.00 © 2008 IEEE  DOI 10.1109/HIS.2008.50

Abstract

The need to consider data containing information that cannot be represented by classical models has led to the development of Symbolic Data Analysis (SDA). As a particular case of symbolic data, symbolic interval time series are interval-valued data collected in a chronological sequence through time. This paper presents two approaches to symbolic interval time series analysis. The first approach is based on artificial neural networks. The second is a new model based on exponential smoothing methods, where the smoothing parameters are estimated using techniques for nonlinear optimization problems with bound constraints. The practicality of the methods is demonstrated by applications to real interval time series.

1. Introduction

Interval-valued data has also been considered in the field of Symbolic Data Analysis (SDA) (Bock and Diday [4]). This field, related to multivariate analysis, pattern recognition and artificial intelligence, aims to extend classical exploratory data analysis and statistical methods to symbolic data. Symbolic data allow multiple (sometimes weighted) values for each variable, and new variable types (set-valued, interval-valued and histogram-valued variables) have been introduced. These new variables make it possible to take into account the variability and/or uncertainty present in the data.

In the field of SDA, interval-valued data appear mainly when the observed values of the variables are intervals of the set of real numbers IR. They arise in situations such as recording monthly interval temperatures at meteorological stations, daily interval stock prices, etc. Another source of interval-valued data is the aggregation of huge databases into a reduced number of groups whose properties are described by interval-valued variables. Therefore, tools for interval-valued data analysis are very much required.

Nowadays, different approaches have been introduced to analyse interval-valued data. Several authors have considered neural network models in order to manage interval-valued data. Patiño-Escarcina et al. [15] propose a one-layer perceptron for classification tasks, where inputs, weights and biases are represented by intervals. Roque et al. [17] propose and analyse a new model of multilayer perceptron based on interval arithmetic that facilitates handling input and output interval data, but where weights and biases are single-valued rather than interval-valued. Other authors have also had success with interval-valued data, but following an interval arithmetic approach. In this paper, we manage interval-valued time series in the context of SDA, without using the operations and functions of interval arithmetic. This is the main feature that distinguishes our paper from those cited above.

In the field of SDA, Billard and Diday [2] have introduced central tendency and dispersion measures suitable for interval-valued data. Cazes et al. [6] and Lauro and Palumbo [12] introduced principal component analysis methods suitable for interval-valued data. Concerning supervised classification, Ichino et al. [9] have introduced a symbolic classifier as a region-oriented approach for interval-valued data. Symbolic Data Analysis also provides a number of clustering methods for interval-valued data (see Chavent et al. [7] and De Carvalho [8]). These methods differ in the type of symbolic data considered, their cluster structures and/or the clustering criteria adopted. Linear regression models have also been considered by Billard and Diday [1] and Lima Neto and De Carvalho [14]. In the context of interval time series, Maia et al. [13] recently proposed approaches to interval-valued time series forecasting based on ARIMA and neural network models, as well as a hybrid methodology that combines both ARIMA and neural network models, in which two models are fitted, respectively, to the mid-point and range of the interval values assumed by the interval-valued time series.

This paper presents two approaches to symbolic interval time series analysis: an approach based on artificial neural networks and another based on exponential smoothing methods. It is organized as follows. In Section 2, we present a brief introduction to symbolic interval time series, the neural network and the interval exponential smoothing model able to handle interval-valued time series. Applications to financial interval time series processing are given in Section 3. Finally, Section 4 offers concluding remarks.
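The mid-point/range decomposition used by the hybrid approach of Maia et al. [13] can be made concrete with a short sketch (our own illustrative code with synthetic values, not the authors' implementation): each interval [X^L, X^U] maps to a mid-point X^c = (X^L + X^U)/2 and a range X^r = X^U − X^L, and the two resulting series are then modeled separately.

```python
import numpy as np

# Interval series as rows [lower, upper]; the values are synthetic.
intervals = np.array([[9.8, 10.5],
                      [9.6, 10.2],
                      [9.9, 10.8]])

midpoint = intervals.mean(axis=1)            # X^c = (X^L + X^U) / 2
half_range = (intervals[:, 1] - intervals[:, 0]) / 2   # X^r / 2

# The decomposition is invertible: the bounds are recovered exactly
# from the mid-point and half-range series.
lower = midpoint - half_range
upper = midpoint + half_range
print(np.allclose(lower, intervals[:, 0]),
      np.allclose(upper, intervals[:, 1]))   # True True
```

Because the map is invertible, forecasts of the mid-point and range series translate back into interval forecasts without loss.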

2. Symbolic interval time series processing

Tools for interval-valued time series analysis are also very much required. In the framework of Symbolic Data Analysis, this paper introduces models to forecast interval-valued time series. When interval data are collected in an ordered sequence against time, we say that we have a symbolic interval time series. We consider in this work that, at each instant of time (t = 1, 2, ..., n), the interval is described by a two-dimensional vector with elements in IR given by an upper bound and a lower bound, respectively $X_t^U$ and $X_t^L$, with $X_t^L \le X_t^U$. Thus, a symbolic interval time series is

$$[X_1^L; X_1^U],\; [X_2^L; X_2^U],\; \ldots,\; [X_n^L; X_n^U]$$

where n denotes the number of intervals of the time series (sample size). Specifically, an observed interval at time t is denoted $I_t$ and is represented as

$$I_t = \begin{bmatrix} X_t^U \\ X_t^L \end{bmatrix}$$

Symbolic interval time series can be observed in various areas: for example, in economics (a family's monthly income measured by the lowest and highest wages of the household, or the daily stock price of a company expressed by the lowest and highest traded prices during the day), engineering (variation of electric current expressed by the lowest and highest intensity in a given day), medicine (diastolic and systolic blood pressure) and weather (minimum and maximum monthly rainfall at a particular place, and relative humidity also measured by the monthly minimum and maximum).

2.1. Neural network for symbolic interval time series

There exists a huge variety of different ANN types, but the most popular one is the multilayer perceptron (MLP). In particular, MLP networks with two layers (one hidden and one output layer) connected acyclically are often used for modeling and forecasting time series. In MLP networks, the relation between the output $y_t$ and the inputs $y_{t-1}, y_{t-2}, \ldots, y_{t-p}$ is

$$y_t = \alpha_0 + \sum_{j=1}^{q} \alpha_j \cdot g\!\left(\beta_{0j} + \sum_{i=1}^{p} \beta_{ij}\, y_{t-i}\right)$$

where $\alpha_0$ and the $\beta_{0j}$'s denote, respectively, the weights of the connections between the constant input (bias) and the output and between the bias and the hidden nodes; the $\alpha_j$'s and $\beta_{ij}$'s are the weights associated with each node; p is the number of inputs and q is the number of hidden nodes; and g denotes the transfer function used at the hidden layer. Transfer functions such as the logistic,

$$g(u) = \frac{1}{1 + \exp(-u)} \qquad (1)$$

are commonly used for time series data because they are nonlinear and continuously differentiable, which are desirable properties for network learning (Kaastra and Boyd [11]).

[Figure 1. Neural network for interval time series processing with 2p inputs, one hidden layer of q nodes, and two outputs.]

For symbolic interval time series processing and forecasting, we use an MLP network with two feedforward layers, 2p inputs (the lagged intervals at t − 1, ..., t − p), q nodes in the hidden layer and two output nodes, each output corresponding to the forecast of one bound, $X_t^L$ or $X_t^U$. Figure 1 shows a typical MLP neural network structure suitable for symbolic interval time series processing. There is a constant input unit, called a bias node, connected to every node in the hidden layer and also to the output nodes. The neural network presented in this paper is an MLP network for symbolic interval time series processing (MLP$_I$), based on typical MLP networks for time series data. The transfer function used at the hidden layer is given by (1).
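As an illustration of the MLP$_I$ forward pass just described, the following sketch (Python with NumPy; all function and variable names are ours, not the authors') computes a two-output interval forecast from 2p lagged bounds through one hidden layer with the logistic transfer function (1):

```python
import numpy as np

def logistic(u):
    """Logistic transfer function g(u) = 1 / (1 + exp(-u)), equation (1)."""
    return 1.0 / (1.0 + np.exp(-u))

def mlp_i_forward(x, W_hidden, b_hidden, W_out, b_out):
    """One forward pass of an MLP_I network (a sketch, not the authors' code).

    x        : shape (2*p,) -- the p lagged intervals, flattened as
               [X_{t-1}^U, X_{t-1}^L, ..., X_{t-p}^U, X_{t-p}^L].
    W_hidden : (q, 2*p) input-to-hidden weights; b_hidden: (q,) hidden biases.
    W_out    : (2, q) hidden-to-output weights;  b_out:    (2,) output biases.
    Returns the two-output forecast [X_t^U, X_t^L].
    """
    hidden = logistic(W_hidden @ x + b_hidden)
    return W_out @ hidden + b_out

# Toy usage with p = 2 lagged intervals and q = 3 hidden nodes.
rng = np.random.default_rng(0)
p, q = 2, 3
x = np.array([10.5, 9.8, 10.2, 9.6])          # two lagged [U, L] pairs
forecast = mlp_i_forward(x,
                         rng.normal(size=(q, 2 * p)), rng.normal(size=q),
                         rng.normal(size=(2, q)), rng.normal(size=2))
print(forecast.shape)  # (2,)
```

With random weights the forecast is of course meaningless; training the weights is discussed next.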


Let $\varphi_{ij}$ and $\omega_j$ be the matrices

$$\varphi_{ij} = \begin{bmatrix} \varphi_{ij}^U \\ \varphi_{ij}^L \end{bmatrix} \quad \text{and} \quad \omega_j = \begin{bmatrix} \omega_j^U \\ \omega_j^L \end{bmatrix}$$

For an MLP$_I$ network with one hidden layer of q nodes, the general prediction equation for computing a forecast of $X_t^L$ and $X_t^U$ (two outputs) using selected past intervals $I_{t-1}, \ldots, I_{t-p}$ as the inputs may be written in the form

$$\hat{I}_t = \begin{bmatrix} \hat{X}_t^U \\ \hat{X}_t^L \end{bmatrix}
= \omega_0 + \sum_{j=1}^{q} \omega_j \cdot g\!\left(\varphi_{0j} + \sum_{i=1}^{p} \varphi_{ij}' I_{t-i}\right)
= \begin{bmatrix}
\omega_0^U + \displaystyle\sum_{j=1}^{q} \omega_j^U \cdot g\!\left(\varphi_{0j} + \sum_{i=1}^{p} \left(\varphi_{ij}^U X_{t-i}^U + \varphi_{ij}^L X_{t-i}^L\right)\right) \\[2ex]
\omega_0^L + \displaystyle\sum_{j=1}^{q} \omega_j^L \cdot g\!\left(\varphi_{0j} + \sum_{i=1}^{p} \left(\varphi_{ij}^U X_{t-i}^U + \varphi_{ij}^L X_{t-i}^L\right)\right)
\end{bmatrix}$$

where the $\varphi$'s denote the weights for the connections between the inputs (lagged intervals) and the hidden nodes, and the $\omega$'s denote the weights between the hidden nodes and the output nodes. We train the MLP using conjugate gradient error minimisation. The conjugate gradient approach finds the optimal weight vector along the current gradient by performing a line search. Details about the conjugate gradient algorithm can be found in Bishop [3].
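The conjugate-gradient training step can be sketched as follows. This is a minimal illustration on a synthetic interval series using SciPy's general-purpose minimizer with `method="CG"`; the parameter packing and all names are our own assumptions, not the authors' implementation:

```python
import numpy as np
from scipy.optimize import minimize

def logistic(u):
    return 1.0 / (1.0 + np.exp(-u))

def unpack(theta, p, q):
    """Split the flat parameter vector into the MLP_I weight arrays."""
    i = 0
    W_h = theta[i:i + q * 2 * p].reshape(q, 2 * p); i += q * 2 * p
    b_h = theta[i:i + q]; i += q
    W_o = theta[i:i + 2 * q].reshape(2, q); i += 2 * q
    return W_h, b_h, W_o, theta[i:i + 2]

def sse(theta, series, p, q):
    """Sum of squared one-step-ahead errors over both interval bounds."""
    W_h, b_h, W_o, b_o = unpack(theta, p, q)
    err = 0.0
    for t in range(p, len(series)):
        x = series[t - p:t].ravel()          # the 2p lagged bounds
        pred = W_o @ logistic(W_h @ x + b_h) + b_o
        err += float(np.sum((series[t] - pred) ** 2))
    return err

# Synthetic interval series: one row [upper, lower] per day.
rng = np.random.default_rng(1)
n, p, q = 60, 3, 4
mid = np.cumsum(rng.normal(size=n)) + 50.0
half = 0.5 + 0.1 * rng.random(n)
series = np.column_stack([mid + half, mid - half])

n_par = q * 2 * p + q + 2 * q + 2
theta0 = rng.normal(scale=0.1, size=n_par)
res = minimize(sse, theta0, args=(series, p, q),
               method="CG", options={"maxiter": 50})
print(res.fun <= sse(theta0, series, p, q))  # the fit improved: True
```

A dedicated neural-network library with analytic gradients would be far more efficient; the point here is only the shape of the objective being minimized.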

2.2. Interval exponential smoothing model

An effective method often utilized for time series forecasting is exponential smoothing (ES). ES computes the one-step-ahead forecast by a formula equivalent to a geometric sum of past observations, namely

$$\hat{y}_{n+1} = \alpha y_n + \alpha(1-\alpha)\, y_{n-1} + \alpha(1-\alpha)^2\, y_{n-2} + \cdots \qquad (2)$$

where $\alpha$ denotes the smoothing parameter in the range (0, 1). Thus, exponential smoothing is a weighted average of past values of an observed process which gives exponentially decreasing weights to past values. In other words, recent observations have relatively more weight in forecasting than older observations: $\alpha > \alpha(1-\alpha) > \alpha(1-\alpha)^2 > \cdots$. In practice, (2) is rewritten in the equivalent updating format

$$\hat{y}_{t+1} = \alpha y_t + (1-\alpha)\,\hat{y}_t \qquad (3)$$

Thus, the estimate is a weighted average of the new observation and the previous estimate. Notice that if $\alpha = 1$ the previous estimate is ignored entirely, while if $\alpha = 0$ the current observation is ignored entirely and the smoothed value consists entirely of the previous smoothed value. The smoothing parameter $\alpha$, constrained to the range (0, 1), can be estimated by minimizing the sum of squared one-step-ahead forecast errors over the period of fit,

$$S(\alpha) = \sum_{t=2}^{n} (y_t - \hat{y}_t)^2$$

The starting value for $\hat{y}_2$ is typically taken as $y_1$. More details are given by Gardner [10].

The interval exponential smoothing (ES$_I$) method follows the ES representation (3) for usual quantitative data and has the form

$$\hat{I}_{t+1} = A I_t + (I - A)\,\hat{I}_t \qquad (4)$$

where A denotes the (2 × 2) smoothing parameter matrix and I is the (2 × 2) identity matrix, that is,

$$A = \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} \quad \text{and} \quad I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

Thus, expanding the matricial expression (4), one can show that the ES$_I$ model has the form

$$\hat{I}_{t+1} = \begin{bmatrix} \hat{X}_{t+1}^U \\ \hat{X}_{t+1}^L \end{bmatrix}
= \begin{bmatrix} \alpha_{11} & \alpha_{12} \\ \alpha_{21} & \alpha_{22} \end{bmatrix} \begin{bmatrix} X_t^U \\ X_t^L \end{bmatrix}
+ \begin{bmatrix} 1-\alpha_{11} & -\alpha_{12} \\ -\alpha_{21} & 1-\alpha_{22} \end{bmatrix} \begin{bmatrix} \hat{X}_t^U \\ \hat{X}_t^L \end{bmatrix}
= \begin{bmatrix} \alpha_{11} X_t^U + (1-\alpha_{11})\hat{X}_t^U + \alpha_{12} e_t^L \\ \alpha_{22} X_t^L + (1-\alpha_{22})\hat{X}_t^L + \alpha_{21} e_t^U \end{bmatrix} \qquad (5)$$

where $e_t^U = X_t^U - \hat{X}_t^U$ and $e_t^L = X_t^L - \hat{X}_t^L$ correspond, respectively, to the forecast errors of the upper and lower bounds at time t. Note that the one-step-ahead forecast of the upper bound uses information relative to the lower bound through $e_t^L$, and the one-step-ahead forecast of the lower bound uses information relative to the upper bound through $e_t^U$. Also, when the diagonal elements of A are close to one, the ES model provides forecasts that depend on the most recent intervals, whereas if the off-diagonal elements of A are equal to zero, $\hat{X}_t^U$ is estimated independently of $X_t^L$ and $\hat{X}_t^L$ is estimated independently of $X_t^U$.

In symbolic interval time series, the smoothing parameter matrix A, with elements constrained to the range (0, 1), can be estimated by minimizing the interval sum of squared one-step-ahead forecast errors,

$$S_2(A) = \sum_{t=2}^{n} (I_t - \hat{I}_t)'(I_t - \hat{I}_t)
= \sum_{t=2}^{n} \begin{bmatrix} X_t^U - \hat{X}_t^U \\ X_t^L - \hat{X}_t^L \end{bmatrix}' \begin{bmatrix} X_t^U - \hat{X}_t^U \\ X_t^L - \hat{X}_t^L \end{bmatrix}
= \sum_{t=2}^{n} \left[ (X_t^U - \hat{X}_t^U)^2 + (X_t^L - \hat{X}_t^L)^2 \right]$$

The starting value for $\hat{I}_2$ is typically taken as $I_1$. The subscript 2 in $S_2(A)$ indicates that we use the $l_2$-norm formulation (residual sum of squares). Alternatively, we can use as objective the $l_1$-norm function (residual sum of absolute errors), $S_1(A) = \sum_{t=2}^{n} |X_t^U - \hat{X}_t^U| + \sum_{t=2}^{n} |X_t^L - \hat{X}_t^L|$. Using (5), we rewrite the objective function $S_2(A)$ as

$$S_2(\alpha_{ij}) = \sum_{t=2}^{n} \left( X_t^U - \alpha_{11} X_{t-1}^U - (1-\alpha_{11})\hat{X}_{t-1}^U - \alpha_{12}\, e_{t-1}^L \right)^2
+ \sum_{t=2}^{n} \left( X_t^L - \alpha_{22} X_{t-1}^L - (1-\alpha_{22})\hat{X}_{t-1}^L - \alpha_{21}\, e_{t-1}^U \right)^2$$

for i, j = 1, 2. Based on the above objective function, the estimation of A can be written formally as a constrained nonlinear programming problem:

$$\min_{\alpha_{ij}} S_2(\alpha_{ij}), \quad \text{subject to } 0 \le \alpha_{ij} \le 1$$

The solution to this problem can be obtained by the L-BFGS-B (limited memory algorithm for bound constrained optimization) method developed by Byrd et al. [5]. This method allows box constraints, that is, each parameter can be given a lower and an upper bound. It is based on the gradient projection method combined with a quasi-Newton approach, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm. The main idea of the BFGS algorithm is to use an approximate Hessian instead of the true Hessian. Nocedal and Wright [16] is a comprehensive reference for these algorithms. The L-BFGS-B algorithm is implemented in the R software package (R Development Core Team [18]).

3. Empirical results

This section shows the usefulness of the presented methods through applications to real symbolic interval time series from the financial domain. In addition, we also fitted to the interval time series the hybrid model proposed by Maia et al. [13], which combines both ARIMA and neural network models: two models are fitted, respectively, to the mid-point ($X_t^c$) and range ($X_t^r$) of the interval time series, based on Zhang's proposal [19]. From the ARIMA residuals of the mid-point and half-range adjustments, respectively, we obtain two new series, $e_{X_t^c}$ and $e_{X_t^r}$. These two new series are fitted by neural networks.

3.1. Measuring accuracy of the methods

In this section we introduce ways of measuring prediction accuracy for symbolic interval time series. The ideas behind these diagnostic checks are helpful for the problem we consider here, namely comparing the relative accuracy of different prediction methods on the same interval time series. The diagnostic checks presented look at the interval residuals given by

$$e_{I_t} = I_t - \hat{I}_t$$

In this paper, the performance evaluation of the presented interval-valued time series forecasting models, the MLP$_I$ network and the ES$_I$ model, is accomplished through the interval mean square error (MSE$_I$) and the interval mean absolute error (MAD$_I$), namely

$$\mathrm{MSE}_I = \frac{1}{2m} \sum_{j=1}^{m} \left[ (X_j^U - \hat{X}_j^U)^2 + (X_j^L - \hat{X}_j^L)^2 \right]$$

$$\mathrm{MAD}_I = \frac{1}{2m} \sum_{j=1}^{m} \left[ |X_j^U - \hat{X}_j^U| + |X_j^L - \hat{X}_j^L| \right]$$

where m denotes the number of fitted intervals. Notice that the MSE$_I$ and MAD$_I$ measures take into account simultaneously the upper bound error and the lower bound error.

3.2. Applications in stock market

The financial applications of the models are carried out on daily symbolic interval time series from the stock market (data available at http://finance.yahoo.com, March 2008). The time series used correspond to stock prices, where the intervals are obtained from daily ranges: the lowest traded price during the day (lower bound price, namely $X^L$) and the highest traded price during the day (upper bound price, namely $X^U$) are recorded in order to gauge the movement in the market for that day. We use interval time series of stock prices of companies in various segments of the market, namely: Google, Inc. (Google), Microsoft Corporation (Microsoft), Petróleo Brasileiro S.A. (Petrobras), Gol Linhas Aéreas Inteligentes S.A. (Gol), Companhia Vale do Rio Doce (Vale), the Coca-Cola Company (Coca-cola), Banco Bradesco S.A. (Bradesco), General Motors Corporation (GM), Hollywood Media Corp. (Hollywood), International Business Machines Corporation (IBM), Banco Itaú Holding Financeira S.A. (Itau), Motorola, Inc. (Motorola), TAM S.A. (TAM), Brasil Telecom S.A. (Telecom) and Wal-Mart Stores, Inc. (Wal-Mart). Table 1 presents the features of the analyzed

time series. The sample sizes represent typical lengths of data in financial time series applications.

Table 1. Symbolic interval time series processed.

Series      Period                                  Sample size
Itau        October 9, 2007 to March 17, 2008       109
Vale        September 13, 2007 to March 13, 2008    126
Hollywood   July 9, 2007 to March 17, 2008          174
Petrobras   July 2, 2007 to March 13, 2008          177
Bradesco    April 3, 2007 to March 17, 2008         240
Telecom     January 10, 2007 to March 17, 2008      297
Google      August 10, 2006 to March 10, 2008       397
TAM         March 10, 2006 to March 17, 2008        507
Gol         December 13, 2005 to March 13, 2008     565
Coca-cola   November 19, 2004 to March 17, 2008     834
Microsoft   February 18, 2003 to March 13, 2008     1277
Wal-Mart    February 25, 2000 to March 17, 2008     2024
GM          March 29, 1989 to March 17, 2008        4782
IBM         June 15, 1987 to March 17, 2008         5234
Motorola    January 3, 1977 to March 17, 2008       7874

In general, the MLP$_I$ networks are trained with 20 inputs, p = 10 (ten lagged intervals). The selection of the best number of hidden units, q, in the MLP$_I$ network involves experimentation. In this paper, a group of neural networks with different numbers of hidden units were trained, and each was evaluated on the interval time series using fifty iterations of the conjugate gradient algorithm with randomly generated initial weights. The best number of hidden units was determined by the mean square error over the fifty iterations.

The results of the experiments in symbolic interval time series processing are presented in Table 2. Notice that the ES$_I$ model outperforms the MLP$_I$ network for the majority of the interval time series considered. The MLP$_I$ network was superior to the ES$_I$ model only for the Itau, Vale and TAM time series. The results also show that the hybrid model (Maia et al. [13]) forecasts these time series more accurately than the MLP$_I$ and ES$_I$ models. For illustration purposes, Figure 2 presents fitted intervals for the Vale interval time series. In this figure, each vertical line segment represents an observed interval; the extremes correspond to the upper bound and to the lower bound.

Table 2. Comparison for the stock prices interval time series.

                     MSE_I                        MAD_I
Series      MLP_I     ES_I      Hybrid     MLP_I    ES_I     Hybrid
Itau        0.439     0.566     0.391      0.540    0.589    0.483
Vale        0.964     1.092     0.776      0.796    0.832    0.687
Hollywood   0.010     0.011     0.009      0.079    0.075    0.071
Petrobras   11.008    8.647     5.529      2.534    2.113    1.763
Bradesco    0.593     0.560     0.442      0.605    0.575    0.524
Telecom     0.720     0.542     0.452      0.650    0.543    0.497
Google      116.885   68.961    51.769     7.748    5.621    5.133
TAM         0.673     0.580     0.533      0.637    0.580    0.555
Gol         0.799     0.667     0.595      0.676    0.625    0.593
Coca-cola   0.264     0.134     0.116      0.370    0.254    0.245
Microsoft   0.136     0.108     0.099      0.241    0.208    0.204
Wal-Mart    0.580     0.568     0.524      0.535    0.518    0.507
GM          1.062     0.785     0.762      0.716    0.612    0.606
IBM         8.086     7.036     4.615      1.408    1.157    1.129
Motorola    4.380     3.468     2.919      1.079    0.815    0.794

[Figure 2. Symbolic interval time series processing. Plot (a) shows the Vale do Rio Doce interval time series; Plot (b) shows the fitted intervals by the MLP$_I$ network; Plot (c) shows the fitted intervals by the ES$_I$ model; and Plot (d) shows the fitted intervals by the hybrid model (Maia et al. [13]).]
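To make the estimation pipeline of Sections 2.2 and 3.1 concrete, the following sketch (synthetic data and our own naming, not the paper's code) runs the ES$_I$ recursion (4), estimates the smoothing matrix A by L-BFGS-B under the box constraints 0 ≤ α_ij ≤ 1, and computes the MSE$_I$ and MAD$_I$ measures:

```python
import numpy as np
from scipy.optimize import minimize

def es_i_fitted(A, series):
    """ES_I recursion (4): I_hat_{t+1} = A I_t + (I - A) I_hat_t.
    `series` has one row [X_t^U, X_t^L] per time point; returns fitted intervals."""
    fitted = np.empty_like(series, dtype=float)
    fitted[0] = series[0]                 # starting value: I_hat_2 = I_1
    ident = np.eye(2)
    for t in range(len(series) - 1):
        fitted[t + 1] = A @ series[t] + (ident - A) @ fitted[t]
    return fitted

def s2(alpha, series):
    """Objective S_2(A): interval sum of squared one-step-ahead errors."""
    fitted = es_i_fitted(alpha.reshape(2, 2), series)
    resid = series[1:] - fitted[1:]
    return float(np.sum(resid ** 2))

# Synthetic interval series (columns [upper, lower]); real data would be
# the daily high/low stock prices of Section 3.2.
rng = np.random.default_rng(2)
mid = np.cumsum(rng.normal(size=200)) + 100.0
half = 1.0 + 0.2 * rng.random(200)
series = np.column_stack([mid + half, mid - half])

# Estimate A with L-BFGS-B under the box constraints 0 <= alpha_ij <= 1.
res = minimize(s2, x0=np.full(4, 0.5), args=(series,),
               method="L-BFGS-B", bounds=[(0.0, 1.0)] * 4)
A_hat = res.x.reshape(2, 2)

# Interval accuracy measures of Section 3.1.
fitted = es_i_fitted(A_hat, series)
mse_i = np.sum((series[1:] - fitted[1:]) ** 2) / (2 * (len(series) - 1))
mad_i = np.sum(np.abs(series[1:] - fitted[1:])) / (2 * (len(series) - 1))
print(A_hat.shape)  # (2, 2)
```

Note that with A equal to the identity matrix the recursion degenerates to the naive forecast Î_{t+1} = I_t, which is a useful sanity check on any implementation.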

4. Conclusions

This paper presented two methods for symbolic interval time series processing. The first approach is based on artificial neural networks. The second is a new model based on exponential smoothing methods, where the smoothing parameters are estimated using techniques for nonlinear optimization problems with box constraints. The practicality of the methods was demonstrated by applications to real financial time series. The time series used correspond to stock prices, where the intervals are obtained from daily ranges, that is, from the lowest traded price during the day (lower bound price, namely $X^L$) and the highest traded price during the day (upper bound price, namely $X^U$). An important result of this paper is that the presented methods can be an especially useful alternative for stock price modelling and forecasting, given that financial time series prediction is a challenge for econometricians. The evaluation of the three methods was accomplished through the average behaviour of the interval mean square error and the interval mean absolute error measures. The results of the applications showed the superiority of the hybrid model (Maia et al. [13]) on the interval-valued time series of historical stock prices of the companies considered. Although neural networks are a powerful tool for financial univariate time series prediction, due to their ability to detect complex nonlinear relationships among variables, the results suggest that the ES$_I$ model outperforms the MLP$_I$ for these stock price interval time series.

Acknowledgments: The authors would like to thank CNPq, CAPES and FACEPE (Brazilian agencies) for their financial support.

References

[1] L. Billard and E. Diday. Regression analysis for interval-valued data. In H. K. et al., editors, Data Analysis, Classification and Related Methods, pages 369–374. Springer, 2000.
[2] L. Billard and E. Diday. From the statistics of data to the statistics of knowledge: symbolic data analysis. Journal of the American Statistical Association, 98:470–487, 2003.
[3] C. M. Bishop. Neural Networks for Pattern Recognition. Oxford University Press, Oxford, 1995.
[4] H. Bock and E. Diday. Analysis of Symbolic Data: Exploratory Methods for Extracting Statistical Information from Complex Data. Springer, Heidelberg, 2000.
[5] R. H. Byrd, P. Lu, J. Nocedal, and C. Zhu. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16:1190–1208, 1995.
[6] P. Cazes, A. Chouakria, E. Diday, and Y. Schektman. Extension de l'analyse en composantes principales à des données de type intervalle. Revue de Statistique Appliquée, 24:5–24, 1997.
[7] M. Chavent, F. A. T. de Carvalho, Y. Lechevallier, and R. Verde. New clustering methods for interval data. Computational Statistics, 21:211–229, 2006.
[8] F. de Carvalho. Fuzzy c-means clustering methods for symbolic interval data. Pattern Recognition Letters, 28(4):423–437, 2007.
[9] M. Ichino, H. Yaguchi, and E. Diday. A fuzzy symbolic pattern classifier. In E. D. et al., editors, Ordinal and Symbolic Data Analysis, pages 92–102. Springer, 1996.
[10] E. S. Gardner, Jr. Exponential smoothing: The state of the art. Journal of Forecasting, 4:1–28, 1985.
[11] I. Kaastra and M. Boyd. Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10:215–236, 1996.
[12] C. N. Lauro and F. Palumbo. Principal component analysis of interval data: a symbolic data analysis approach. Computational Statistics, 15:73–87, 2000.
[13] A. L. S. Maia, F. A. T. de Carvalho, and T. B. Ludermir. Forecasting models for symbolic interval time series. Neurocomputing, in press.
[14] E. A. Lima Neto and F. A. T. de Carvalho. Centre and range method for fitting a linear regression model to symbolic interval data. Computational Statistics and Data Analysis, 52:1500–1515, 2008.
[15] R. E. Patiño-Escarcina, B. R. C. Bedregal, and A. Lyra. Interval computing in neural networks: one layer interval neural networks. In Proceedings of the 7th International Conference on Information Technology, CIT 2004, Hyderabad, India, pages 68–75. IEEE, 2004.
[16] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, Heidelberg, 1999.
[17] A. M. Roque, C. Maté, J. Arroyo, and Á. Sarabia. iMLP: applying multi-layer perceptrons to interval-valued data. Neural Processing Letters, 25:157–169, 2007.
[18] R Development Core Team. R: A Language and Environment for Statistical Computing. http://www.R-project.org, Vienna, Austria, 2006.
[19] G. Zhang. Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50:159–175, 2003.
