
Financial time series data forecasting by Wavelet and TSK fuzzy rule based system

Pei-Chann Chang 1,2,*, Chin-Yuan Fan 2, Shih-Hsin Chen 2

1 Department of Information Management, Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C.
2 Department of Industrial Engineering and Management, Yuan Ze University, Taoyuan 32026, Taiwan, R.O.C.
* E-mail: [email protected]

Abstract

In this study, a novel approach that integrates the wavelet transform with Takagi-Sugeno-Kang (TSK) fuzzy rule based systems (FRBS) is developed for financial time series data prediction. The wavelet method is applied to eliminate the noise caused by random fluctuations. The output of the wavelet step is then fed to the TSK fuzzy rule system to predict future values of the time series. In intensive experimental tests on the TSE index, the model forecast stock price variation with accuracy close to 97.6%.

1. Introduction

Mining stock market tendency is a very challenging task because of the market's high volatility and noisy environment. Stock and futures traders have depended heavily on various types of intelligent systems to make trading decisions. However, the financial market is a complex, evolutionary, and non-linear dynamical system (Abu-Mostafa and Atiya, 1996). Forecasting can be described as taking a series of available data \(x_{t-n}, \ldots, x_{t-2}, x_{t-1}, x_t\) and predicting the future values \(x_{t+1}, x_{t+2}, \ldots\). The prediction of future time series values based on past and present information is very useful.

Fuzzy systems, due to their universal approximation property, constitute a good framework for modeling complex and highly nonlinear systems. However, the existing fuzzy modeling methods follow a deterministic approach. Hence, they tend to interpret the inconsistencies in the training data as noise. In this study, a new approach is proposed as a modification to a standard fuzzy modeling method based on a data preprocessing scheme, in which a wavelet method is applied to eliminate the noise caused by random fluctuations. This framework combines several artificial intelligence technologies such as the wavelet transform, neural networks, and fuzzy logic. In addition to developing the prediction framework, the wavelet denoising method is also emphasized and analyzed in this paper. The simulation is based on time series data from the Taiwan stock market.

The rest of the paper is organized as follows. Section 2 surveys and discusses related literature on wavelet theory, fuzzy rule based systems and different stock forecasting methods. Section 3 gives an application-level description of the test-bed application. Section 4 presents an empirical evaluation of the results obtained with the application. Finally, conclusions and directions for future work are covered in Section 5.

2. Literature Review

Prediction of stock price variation is a very difficult task: the price movement is time varying and behaves much like a random walk. During the last decade, stockbrokers and futures traders have come to rely upon various types of intelligent systems to make trading decisions. Recently, artificial neural networks (ANNs) have been applied to this area (Aiken and Bsat, 1999; Chang et al., 2004; Chi et al., 1999; Kimoto and Asakawa, 1990; Lee, 2001; Yao and Poh, 1995; Yoon and Swales, 1991). These models, however, have their limitations owing to the tremendous noise and complex dimensionality of stock price data; in addition, the quantity of data itself and the input variables may interfere with each other. Therefore, the results may not be very convincing.

Other soft computing (SC) methods have also been applied to stock price prediction. These SC approaches use quantitative inputs, such as technical indices, and qualitative factors, such as political effects, to automate stock market forecasting and trend analysis. Aiken and Bsat (1999) use a feed-forward neural network (FFNN) trained by a genetic algorithm (GA) to forecast three-month U.S. Treasury Bill rates. They conclude that an NN can be used to accurately predict these rates. Abraham et al. (2001) investigated hybridized SC techniques for automated stock market forecasting and trend analysis. They used principal component analysis to preprocess the input data, an NN for one-day-ahead stock forecasting and a neuro-fuzzy system for analyzing the trend of the predicted stock values. Abraham et al. (2003) investigate how the seemingly chaotic behavior of stock markets could be well represented using several connectionist paradigms and soft computing techniques. They concluded that all the connectionist paradigms considered could represent the behavior of the stock indices very accurately.

Recently, neuro-fuzzy networks have been applied successfully in various areas, for example in Chang et al. (2005), Chang et al. (2006a) and Chang et al. (2006b). Two typical types of neuro-fuzzy networks are Mamdani-type (Wang and Mendel, 1992) and TSK-type models (Takagi and Sugeno, 1985).

Fourth International Conference on Fuzzy Systems and Knowledge Discovery (FSKD 2007) 0-7695-2874-0/07 $25.00 © 2007

3. Methodology

The Taiwan Stock Exchange (TSEC) has been operating since 1962. At the end of January 2005, TSEC had 699 listed companies with a market capitalization topping NT$13.7 trillion (USD 396 billion), ranking it 10th among the world's stock exchanges. In addition to stocks, TSEC also trades government bonds, corporate bonds, mutual funds, warrants and depositary receipts. The trading value of stocks amounted to NT$1,155 billion (USD 33.5 billion) as of January 2005, with average daily trading value reaching NT$51 billion (USD 1.48 billion). Most stock trading goes to the listed IT companies. The trading value of the TSEC stock market places it in the top ten stock exchanges in the world. The hybrid model, which integrates a wavelet with a Takagi and Sugeno fuzzy system forecasting model for Taiwan Stock Market index prediction, is explained in detail in the following sections.

3.1. Data Preprocessing by Wavelet Theory

The Haar wavelet is a wavelet evolved from the “Continuous Wavelet Transform” [21][22]. The function \(\psi(t)\) represents the mother wavelet, as shown in Figure 1:

\[ \psi_{ab}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right), \quad a, b \in R \tag{1} \]

In equation (1), \(a\) is the observation scale parameter (e.g. stock index number) and \(b\) is the parallel scale (translation) parameter (e.g. stock index number moving time). This equation transforms a single signal (an \((a, b)\) value) into a single wavelet \(\psi_{ab}(t)\). However, time series data are shown as continuous data rather than discrete data. After dilation and translation, the mother wavelet becomes a series equation like equation (2):

\[ CWT_f(a, b) = \int_{-\infty}^{\infty} f(t)\,\frac{1}{\sqrt{a}}\,\psi^{*}\!\left(\frac{t-b}{a}\right) dt = \langle f, \psi_{ab} \rangle \tag{2} \]

In equation (2), multiplying by the factor \(1/\sqrt{a}\) normalizes the data norm. This process retains the continuous data during the transform process without changing their real range. With these equations in place, the wavelet transform process can be described as in Fig. 1. Consider an input signal \(F(t)\); the multiresolution decomposition of the signal can be defined as:

\[ F(t) = S_1(t) + D_1(t) + \cdots + S_J(t) + D_J(t) \]

\[ S_J(t) = \sum_k S_{J,k}\,\phi_{J,k}(t), \qquad D_J(t) = \sum_k D_{J,k}\,\psi_{J,k}(t) \tag{3} \]

Fig. 1. Wavelet transform process

The wavelet transform is a series of transformations applied to the original data pattern. It prunes away the extreme data and smooths the pattern. As shown in Fig. 2, F(t) represents the original data, S1 the approximation signal and D1 the detail signal. S1 presents the data pattern smoothed by a 1-level wavelet, so it is called the approximation signal. D1 presents the pattern that helps to distinguish the extreme data, and it is called the detail signal. When the data pattern is rough, 2-4 levels of the wavelet process are repeated.
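The split into an approximation signal and a detail signal can be illustrated with a minimal sketch of the discrete Haar transform. This is not the authors' Matlab code: the function names, the pairwise sum/difference convention, the padding of odd-length input and the sample prices are all assumptions made for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_step(signal):
    """One Haar level: return (approximation, detail) coefficient lists."""
    x = [float(v) for v in signal]
    if len(x) % 2:                      # assumption: pad odd input with its last sample
        x.append(x[-1])
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Invert one Haar level, interleaving the reconstructed samples."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out

def haar_smooth(signal, levels=1):
    """Keep only the approximation after `levels` levels (details set to zero)."""
    approx = [float(v) for v in signal]
    lengths = []
    for _ in range(levels):
        lengths.append(len(approx))
        approx, _detail = haar_step(approx)
    for length in reversed(lengths):
        approx = haar_inverse_step(approx, [0.0] * len(approx))[:length]
    return approx

prices = [10.0, 12.0, 11.0, 15.0]       # hypothetical closing prices
approx, detail = haar_step(prices)
restored = haar_inverse_step(approx, detail)
smoothed = haar_smooth(prices, levels=1)
```

With both signals kept, `restored` reproduces `prices`; discarding the detail signal leaves each pair replaced by its mean, which is the smoothing effect described above. Raising `levels` to 2-4 repeats the process on rougher data.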

3.2. Development of a Wavelet and TSK fuzzy rule based system

In this paper, a wavelet is applied as the data preprocessing tool to eliminate the noise within financial time series data, and then a hybrid model integrating the wavelet and TSK fuzzy rule based systems is applied for future stock price prediction. The main motivation of this study is to propose a new data mining approach for exploring stock market tendency and to test the predictability of the proposed hybrid model by comparing it with conventional models and neural network models.

3.2.1. Training process.

Most previous researchers have employed multivariate input. Several studies have examined the cross-sectional relationship between the stock index and macroeconomic variables. Technical indexes are calculated from the variation of stock price, trading volume and time, following a set of formulas, to reflect the current tendency of stock price fluctuations. These indexes can be applied in decision making to evaluate whether the stock market is oversold or overbought. The potential input variables used by the forecasting models include KD, RSI, MACD, MA, and BIAS, according to Chang et al. (2004).
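As an illustration of how such technical indexes are derived from the closing price series, the sketch below computes a 6-day moving average and a 6-day bias. The paper only describes the bias as the difference between the closing value and the moving average line, so expressing it as a percentage of the moving average is an assumption here, as are the function names and the sample prices.

```python
def moving_average(close, window=6):
    """Simple moving average over `window` days; one value per full window."""
    return [sum(close[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(close))]

def bias(close, window=6):
    """Deviation of the close from its moving average, in percent (assumed form)."""
    ma = moving_average(close, window)
    return [(c - m) / m * 100.0 for c, m in zip(close[window - 1:], ma)]

close = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0]   # hypothetical closing prices
ma6 = moving_average(close)
bias6 = bias(close)
```

Both series are shorter than the input by `window - 1` values, because the first five days have no complete 6-day window.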

In equation (4), the superscript (k) denotes the k-th input-output set, k = 1, ..., N, where N is the total number of input-output sets.


The four steps of the training process are explained below:

Step 1. Data Preprocessing:

In this step, the wavelet and Fourier transforms are applied to preprocess the data from the Taiwan Stock Exchange index. Figure 2 shows the 495 stock price data points from the Taiwan Stock Exchange Index before and after the first-level wavelet process, including the approximation and detail signals. The approximation signal shows the trend of the stock price data, while the detail signal shows the value margin and time tendency. If the pattern value is positive, the data pattern is moving upward; if it is negative, the pattern is moving downward. Through this process, market investors can get a clear picture of the trend or movement of the data.

Step 2. Choosing the Forecasting Factors:

A stepwise regression model is applied, using the F value and the sum of squared errors to evaluate the significant factors. The detailed procedure is as follows:

1. Choose the critical point and obtain the level of significance of every critical point;
2. Set the F-enter and F-remove values and consider every F value: if F is greater than or equal to F-enter, the factor is added to the model; if F is less than F-remove, the factor is removed.

Step 3. Classifying the training data by K-means:

The K-means method is applied to cluster the data into sub-classes. The data in each sub-class are more homogeneous, and each sub-class is identified as a single rule in the TSK fuzzy rule base. Through the clustering process, the high-dimensionality problem can be handled efficiently while the accuracy of the rule model is still preserved. However, the number of clusters still needs to be determined through a series of experiments.
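The clustering step can be sketched as follows. This is a plain Lloyd-style K-means, not the authors' C++ Builder implementation; the random initialization, the stopping rule, the helper `cluster_stats` and the toy data are assumptions. The per-cluster mean and standard deviation are computed because each sub-class later becomes one rule with Gaussian membership functions.

```python
import math
import random
import statistics

def kmeans(points, k, n_iter=100, seed=0):
    """Cluster `points` (lists of floats) into k groups; return (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]   # assumption: random initial centers
    labels = [0] * len(points)
    for _ in range(n_iter):
        # assign every point to its nearest center
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])           # keep the center of an empty cluster
        if new_centers == centers:                       # no center moved: converged
            break
        centers = new_centers
    return labels, centers

def cluster_stats(points, labels, k):
    """Per-cluster (means, standard deviations) of every input variable."""
    stats = []
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        means = [statistics.fmean(col) for col in zip(*members)]
        stds = [statistics.pstdev(col) for col in zip(*members)]
        stats.append((means, stds))
    return stats

data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]        # hypothetical two-group data
labels, centers = kmeans(data, k=2)
stats = cluster_stats(data, labels, k=2)
```

Running the same routine for several values of `k` and comparing the forecasting error of each resulting rule base is one way to carry out the series of experiments mentioned above.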


Fig. 2. Stock price data before and after the wavelet transform

Step 4. Training a fuzzy rule base:

Assume \(x_1\) and \(x_2\) are input values and \(y\) is the output value; then the set of input-output combinations can be formulated as below:

\[ \left(x_1^{(1)}, x_2^{(1)}; y^{(1)}\right), \left(x_1^{(2)}, x_2^{(2)}; y^{(2)}\right), \left(x_1^{(3)}, x_2^{(3)}; y^{(3)}\right), \ldots, \left(x_1^{(k)}, x_2^{(k)}; y^{(k)}\right), \ldots \tag{4} \]

Suppose there are two fuzzy rules in the rule base and one test case \((x_1, x_2)\). First, calculate the membership degree (md) of \((x_1, x_2)\) under each rule. In this research the Gaussian membership function is used:

\[ md_{ij} = \exp\left[-\frac{(x_j - a_{ij})^2}{\sigma_{ij}^2}\right] \tag{5} \]

In this equation, \(a_{ij}\) and \(\sigma_{ij}\) are the mean and standard deviation of the j-th input values in the i-th group. The md of \((x_1, x_2)\) can then be described as below:

\((x_1, x_2) \Rightarrow [x_1\ (0.7 \text{ in } A_{11}),\ x_2\ (0.8 \text{ in } A_{12})]\) ..... Rule 1

\((x_1, x_2) \Rightarrow [x_1\ (0.8 \text{ in } A_{21}),\ x_2\ (0.6 \text{ in } A_{22})]\) ..... Rule 2

The next step uses the product of the md values to calculate the degree to which the test case matches each fuzzy rule (written \(D(\text{Rule } i)\), or simply \(D^i\)). Finally, the higher D value is identified for forecasting. This can be written as below:

\[ D^i = \prod_{j=1,\ldots,n} md_{ij} \tag{6} \]

In this equation, \(D^i\) is the membership degree of rule i, and n is the number of input data.

Once a series of performance measurements has been determined, the next step is to consider a better approach for improving forecasting efficiency. In this research, we combine K-means, the wavelet, and the TSK fuzzy system for stock price forecasting. The time complexity of each method is as follows: the time complexity of K-means is \(O(n^2)\), and that of the wavelet and TSK fuzzy steps is \(O(n^2 + m^4)\). Therefore, the time complexity of the proposed system is \(O(n^2 + m^4)\).
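Equations (5) and (6) can be checked with a short sketch that reuses the membership degrees quoted for Rule 1 and Rule 2. The function names and the input values passed to `membership` are assumptions for illustration; the consequent part of the TSK rules is not shown because the text does not spell it out.

```python
import math

def membership(x, means, sigmas):
    """Equation (5): md[i][j] = exp(-((x[j] - a[i][j]) / sigma[i][j]) ** 2)."""
    return [[math.exp(-(((xj - a) / s) ** 2)) for xj, a, s in zip(x, a_row, s_row)]
            for a_row, s_row in zip(means, sigmas)]

def firing_strength(md):
    """Equation (6): D[i] is the product of md[i][j] over all inputs j."""
    return [math.prod(row) for row in md]

# Membership degrees taken from the Rule 1 / Rule 2 example in the text
md_example = [[0.7, 0.8],
              [0.8, 0.6]]
degrees = firing_strength(md_example)            # approximately [0.56, 0.48]
best_rule = max(range(len(degrees)), key=degrees.__getitem__)

# Hypothetical means and deviations, only to exercise equation (5)
md_from_inputs = membership([1.0, 2.0],
                            means=[[1.0, 2.0], [2.0, 2.0]],
                            sigmas=[[1.0, 1.0], [1.0, 1.0]])
```

For the quoted degrees, Rule 1 gives 0.7 × 0.8 = 0.56 and Rule 2 gives 0.8 × 0.6 = 0.48, so `best_rule` is index 0, that is, Rule 1 has the higher D value.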

4. Experimental Results

In this research, Matlab 7.0 is used to preprocess the stock closing price data with the Haar wavelet transform, and C++ Builder is used to develop the K-means data clustering and the T.S.K. fuzzy rule based system. The computing equipment is a Pentium Ⅳ 2.4G computer with 512 KB memory. To derive effective input factors, SPSS statistical software is used for stepwise regression. Through this process, this study chooses five important factors, MA6, BIAS6, K9, D9, and MACD9, as shown in Table 1.

Table 1. Technical indices.

| Technical index | Explanation |
|---|---|
| Six days Moving Average (MA6) | Moving averages are used to emphasize the direction of a trend and smooth out price and volume fluctuations that can confuse interpretation. |
| Six days bias (BIAS6) | The difference between the closing value and the moving average line, which uses the stock price's nature of returning to the average price to analyze the stock market. |
| Nine days Stochastic line (K9, D9) | The stochastic line K and line D are used to determine the signals of over-purchasing, over-selling, or deviation. |
| Moving Average Convergence and Divergence (MACD9) | MACD shows the difference between a fast and a slow exponential moving average (EMA) of closing prices. Fast means a short-period average, and slow means a long-period one. |

The proposed model is compared with two well-known forecasting methods, GANN and GAWM; details of these two models can be found in [9][10]. In the K-means clustering algorithm, the number of clusters must be pre-defined. To check the sensitivity of our model's performance to the number of clusters, various numbers of data clusters are investigated. In the experiment, the stock price data were clustered into 2 to 8 clusters. The performance (MAPE) of the algorithm with different numbers of clusters is shown in Fig. 3. From the figure, we see that the best performance is achieved with 3 clusters. The convergence of the learning process for the rule consequence parameters using SA is shown in Fig. 4. Finally, the MAPE of the forecasting model drops to 3.8%.

Fig. 3. The number of clusters and the accuracy derived from each clustered model

Fig. 4. SA training convergence (vertical axis: MAPE)

Table 2. Comparisons of different forecasting models

| | Wavelet and TSK model | GANN model | K-GAWM model |
|---|---|---|---|
| Best MAPE | 0.759% | 1.85% | 3.09% |
| Ave MAPE | 0.815% | 2.24% | 3.54% |
| Execution time (ave) | 7.56 sec | 20.21 sec | 12.56 sec |

From the above table, it can be seen that the wavelet and T.S.K. model obtains much better solutions than GANN and K-GAWM. Its average MAPE values are quite small, which means that the wavelet and T.S.K. model shows very good forecasting capability. In terms of program execution time, the wavelet and T.S.K. model is also superior to the other two methods. The comparison of the forecasting ability of these three methods therefore demonstrates the effectiveness of combining the wavelet with the T.S.K. fuzzy system.
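The MAPE values compared above are percentages. The text does not spell out the formula, so the sketch below assumes the standard definition of mean absolute percentage error; the function name and the sample numbers are illustrative and are not taken from the experiments.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (standard definition)."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors) * 100.0

actual = [100.0, 200.0]        # hypothetical index values
forecast = [110.0, 190.0]      # hypothetical forecasts
error_pct = mape(actual, forecast)   # (10% + 5%) / 2 = 7.5
```

Because each error is divided by the actual value, the measure is scale free, which is what allows models to be compared on one percentage scale as in Table 2.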

5. Conclusions

Wavelet theory and T.S.K. fuzzy systems are powerful and useful tools that have been widely applied in many different aspects of continuous optimization systems. The reasons for the better performance of our proposed system are as follows. First, the wavelet process can further reduce the noise in the data. Second, the K-means algorithm is applied to cluster the data into sub-classes with more homogeneous properties. The TSK fuzzy system can then generate a more efficient rule base from these clusters and thus solve the high-dimensionality problem of the data. However, the combined use of these two methods has seldom been examined, especially in forecasting time series data. According to the literature review, neural networks, heuristic models and regression models are frequently used to solve time series forecasting problems. In our research, the hybrid method integrating the wavelet and the T.S.K. fuzzy system is more efficient and accurate than traditional forecasting methods. As shown in Table 2, the wavelet and T.S.K. forecasting method produced better results than the GANN and K-GAWM models. The integration of these two methods is pioneering work in stock price forecasting.

References

[1]. Abu-Mostafa, Y.S. and A.F. Atiya. “Introduction to financial forecasting.” Applied Intelligence, 6, 1996, pp. 205-13.
[2]. Abraham, A., N. Baikunth, and P.K. Mahanti. “Hybrid Intelligent Systems for Stock Market Analysis.” Lecture Notes in Computer Science, 2074, 2001, pp. 337-345.
[3]. Abraham, A., N.S. Philip, and P. Saratchandran. “Modeling Chaotic Behavior of Stock Indices Using Intelligent Paradigms.” Neural, Parallel and Scientific Computations, 11, 2003, pp. 143-160.
[4]. Aiken, M. and M. Bsat. “Forecasting Market Trends with Neural Networks.” Information Systems Management, 16(4), 1999, pp. 42-48.
[5]. Baba, N., N. Inoue, and H. Asakawa. “Utilization of Neural Networks & GAs for Constructing Reliable Decision Support Systems to Deal Stocks.” IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN'00), (5), 2000, pp. 5111-5116.
[6]. Brownstone, D. “Using Percentage Accuracy to Measure Neural Network Predictions in Stock Market Movements.” Neurocomputing, (10), 1996, pp. 237-250.
[7]. Chang, P.C., Y.W. Wang, and W.N. Yang. “An Investigation of the Hybrid Forecasting Models for Stock Price Variation in Taiwan.” Journal of the Chinese Institute of Industrial Engineering, 21(4), 2004, pp. 358-368.
[8]. Chang, P.C. and T. Warren Liao. “Combining SOM and Fuzzy Rule Base for Flow Time Prediction in Semiconductor Manufacturing Factory.” Applied Soft Computing, 6(2), 2006a, pp. 198-206.
[9]. Chang, P.C. and Y.W. Wang. “Fuzzy Delphi and Back-Propagation Model for Sales Forecasting in PCB Industry.” Expert Systems with Applications, 30(4), 2006b, pp. 715-726.
[10]. Chang, P.C., C.H. Liu, and Y.W. Wang. “A Hybrid Model by Clustering and Evolving Fuzzy Rules for Sale Forecasting in Printed Circuit Board Industry.” Decision Support Systems, available online December 20, 2005.
[11]. Chen, A.S., M.T. Leung, and H. Daouk. “Application of Neural Networks to an Emerging Financial Market: Forecasting and Trading the Taiwan Stock Index.” Computers and Operations Research, 30, 2003, pp. 901-923.
[12]. Chen, M.Y. and D.A. Linkens. “Rule-base self-generation and simplification for data-driven fuzzy models.” Fuzzy Sets and Systems, 142, 2004, pp. 243-265.
[13]. Chi, S.C., H.P. Chen, and C.H. Cheng. “A Forecasting Approach for Stock Index Future Using Grey Theory and Neural Networks.” IEEE International Joint Conference on Neural Networks, 1999, pp. 3850-3855.
[14]. Izumi, K. and K. Ueda. “Analysis of Exchange Rate Scenarios Using an Artificial Market Approach.” Proceeding of the International Conference on Artificial Intelligence, 2, 1999, pp. 360-366.
[15]. Kim, K.J. and I. Han. “Genetic Algorithms Approach to Feature Discretization in Artificial Neural Networks for the Prediction of Stock Price Index.” Expert Systems with Applications, 19, 2000, pp. 125-132.
[16]. Kimoto, T. and K. Asakawa. “Stock market prediction system with modular neural network.” IEEE International Joint Conference on Neural Network, 1990, pp. 1-6.
[17]. Lee, J.W. “Stock Price Prediction Using Reinforcement Learning.” IEEE International Joint Conference on Neural Networks, 2001, pp. 690-695.
[18]. Quah, T.S. and B. Srinivasan. “Improving Returns on Stock Investment through Neural Network Selection.” Expert Systems with Applications, 17, 1999, pp. 295-301.
[19]. Takagi, T. and M. Sugeno. “Fuzzy Identification of Systems and its Application to Modeling and Control.” IEEE Transactions on Systems, Man and Cybernetics, 15, 1985, pp. 116-132.
[20]. Wang, L.X. and J.M. Mendel. “Generating Fuzzy Rules by Learning from Examples.” IEEE Transactions on Systems, Man, and Cybernetics, 22, 1992, pp. 1414-1427.
[21]. Tao Li, Qi Li, Shenghuo Zhu, and Mitsunori Ogihara. “A survey on wavelet applications in data mining.” ACM SIGKDD Explorations Newsletter, 4(2), 2002, pp. 49-68.
[22]. Cheng, H.P. and Chuen-Sheng Cheng. “Control Chart Pattern Recognition Using Wavelet Analysis and Neural Networks.” ANQ Congress 2006, Singapore.
[23]. Chang, P.C. and C.H. Liu. “A TSK type Fuzzy Rule Based System for Stock Price Prediction.” Expert Systems with Applications, available online 25 September 2006.
