GENETIC PROGRAMMING IN TIME SERIES MODELLING: AN APPLICATION TO METEOROLOGICAL DATA

Katya Rodríguez Vázquez
DISCA-IIMAS-UNAM, Circuito Escolar, Cd. Universitaria, México City, 04510, MEXICO
[email protected]

Abstract. This paper describes a genetic programming approach to the prediction of real meteorological data. The well-known SISO NARMAX model is used to model and forecast this real time series. Candidate models are evaluated against a set of criteria chosen to obtain the simplest and most accurate model for prediction.

1. INTRODUCTION

The purpose of this study is to explore the use of genetic programming for predicting future values of a time series. In previous work, the polynomial NARMAX model was described and explored for system identification problems; that work is the foundation of this study. Based upon the NARMAX (Non-linear AutoRegressive Moving Average with eXogenous inputs) model (Leontaritis and Billings, 1985), which comprises a wide group of different non-linear structures, two evolutionary algorithm schemes have been proposed to solve identification problems:

• Genetic Algorithms (GAs) using a subset and variable representation (Fonseca et al., 1995)
• Genetic Programming (GP) using a dynamic tree representation (Rodríguez-Vázquez et al., 1997a)

In the first case, the choice of system structure was defined as a subset selection problem. The goal was to find the subset of terms (regressors), from a given total set, that gave the best performance. In the case of GP, system identification was formulated as the inductive discovery of programs (each representing a dynamical polynomial model) that, again, possessed the best performance. In contrast to the GA scheme, which is restricted to a given total set of terms, GP proved to be an alternative that overcomes this restriction: the GP approach can generate any model structure as complex as the maximum tree depth parameter allows, without increasing the processing time.

As mentioned in the previous paragraph, Evolutionary Algorithms (GAs and GP) have been used for modelling single-input single-output (SISO) non-linear systems, which include the linear model as a special case, based upon polynomial and rational forms of the NARMAX function. More recently, the extension to the MIMO form of the NARMAX model was also introduced in the GP approach (Rodríguez-Vázquez, 2000). However, it was only tested on simulated data.

Building on these ideas on modelling and identification, this paper treats the modelling and forecasting problem. The paper is structured as follows. Section 2 describes the SISO NARMAX model and its expansion into the MIMO form. Section 3 details the performance measures used in the prediction of future values of the meteorological time series. Section 4 gives information about the meteorological data. In Section 5, the prediction task is performed and the measured and predicted values are compared. Finally, conclusions are drawn.

2. SISO AND MIMO NARMAX MODELS

The model proposed by Leontaritis and Billings (1985) describes the Single-Input Single-Output (SISO) NARMAX system

\[
y(k) = F^{\ell}\{\, y(k-1), \ldots, y(k-n_y),\; u(k-1), \ldots, u(k-n_u),\; e(k-1), \ldots, e(k-n_e) \,\} + e(k) \tag{1}
\]

where $y(k)$, $u(k)$ and $e(k)$ are the system output, input and noise sequences, respectively; $n_y$, $n_u$ and $n_e$ are the maximum lags in the output, input and noise; $\{e(k)\}$ is assumed to be a white sequence and $F^{\ell}(\cdot)$ is some non-linear function.

2.1. Polynomial Representation

The NARMAX model is the most general form of input-output model and can be expressed in different ways. Chen and Billings (1989) have shown that the polynomial NARMAX model is the most common expression which works well in practical applications. Equation (1) can be written in polynomial form as

\[
\begin{aligned}
y(k) ={} & \Big[\, \theta_0 + \sum_{i=1}^{n_y} \theta_i\, y(k-i) + \sum_{i=1}^{n_u} \theta_{n_y+i}\, u(k-i) + \sum_{i=1}^{n_y}\sum_{j=1}^{n_y} \theta_{i,j}\, y(k-i)\, y(k-j) \\
& \;\; + \sum_{i=1}^{n_y}\sum_{j=1}^{n_u} \theta_{i,n_y+j}\, y(k-i)\, u(k-j) + \sum_{i=1}^{n_u}\sum_{j=1}^{n_u} \theta_{n_y+i,n_y+j}\, u(k-i)\, u(k-j) \\
& \;\; + \text{higher order terms up to degree } \ell \,\Big] \\
& + \Big[\, \sum_{i=1}^{n_y}\sum_{j=0}^{n_e} \theta_{i,0,j}\, y(k-i)\, e(k-j) + \sum_{i=1}^{n_u}\sum_{j=0}^{n_e} \theta_{0,i,n_e+j}\, u(k-i)\, e(k-j) \\
& \;\; + \sum_{i=1}^{n_y}\sum_{j=1}^{n_u}\sum_{l=0}^{n_e} \theta_{i,j,l}\, y(k-i)\, u(k-j)\, e(k-l) \\
& \;\; + \text{all possible combinations of } y(k),\ u(k) \text{ and } e(k) \text{ up to degree } \ell \,\Big] \\
& + \Big[\, \sum_{i=0}^{n_e} \theta_i\, e(k-i) + \sum_{i=0}^{n_e}\sum_{j=0}^{n_e} \theta_{i,j}\, e(k-i)\, e(k-j) + \text{higher order terms up to degree } \ell \,\Big]
\end{aligned}
\tag{2}
\]

which can be expressed as

\[
y(k) = \Psi_{yu}^{T}(k-1)\,\theta_{yu} + \Psi_{yue}^{T}(k)\,\theta_{yue} + \Psi_{e}^{T}(k)\,\theta_{e} \tag{3}
\]

where $\Psi_{yu}^{T}(k-1)$ includes the constant term and all the output and input terms as well as all possible combinations up to degree $\ell$. These terms will be referred to as process terms. The parameters of such terms are in the vector $\theta_{yu}$. The other vectors of monomials are defined likewise. $\Psi_{yue}^{T}(k)$ and $\Psi_{e}^{T}(k)$ will be referred to as noise terms. However, because the noise $e(k)$ is unknown, equation (3) can be rewritten in the prediction error (PE) form as

\[
y(k) = \Psi_{yu}^{T}(k-1)\,\hat{\theta}_{yu} + \Psi_{yu\varepsilon}^{T}(k)\,\hat{\theta}_{yu\varepsilon} + \Psi_{\varepsilon}^{T}(k)\,\hat{\theta}_{\varepsilon} + \varepsilon(k) \tag{4}
\]

where the residual $\varepsilon(k)$ is defined as

\[
\varepsilon(k) = y(k) - \hat{y}(k, \hat{\theta}) \tag{5}
\]

In a simplified form, equation (4) is written as

\[
y(k) = \Psi^{T}(k-1)\,\hat{\theta} + \varepsilon(k) \tag{6}
\]

where

\[
\Psi^{T}(k-1) = \big[\, \Psi_{yu}^{T}(k-1) \;\; \Psi_{yu\varepsilon}^{T}(k-1) \;\; \Psi_{\varepsilon}^{T}(k-1) \,\big] \tag{7}
\]

and

\[
\hat{\theta}^{T} = \big[\, \hat{\theta}_{yu}^{T} \;\; \hat{\theta}_{yu\varepsilon}^{T} \;\; \hat{\theta}_{\varepsilon}^{T} \,\big] \tag{8}
\]

Equation (6) belongs to the linear regression model

\[
y(k) = P_i(k)\,\hat{\theta} + \varepsilon(k) \tag{9}
\]

Then, $P_i(k)$ consists of all possible linear output, input and noise terms, and all possible non-linear terms in the output, input, noise and combined terms. The polynomial model is then non-linear in the output, input and noise but linear in the parameters. This set of coefficients is estimated by means of an extended Least Squares (LS) algorithm (Billings and Voon, 1984). This method consists of estimating the process terms first and then using equation (5) to obtain the residuals. Once the residuals are computed, the noise terms are incorporated into the matrix $\Psi(k-1)$ and a new set of parameters is estimated. This process is repeated until the residuals converge or a predetermined number of iterations is reached.

2.2. MIMO NARMAX Model

Equation (1) is the basis for the identification of MIMO non-linear systems. Expanding equation (1) in its MIMO form gives

\[
y_i(k) = F_i^{\ell}\{\, y_i(k-1), \ldots, y_i(k-n_{y_i}),\; u_j(k-1), \ldots, u_j(k-n_{u_j}),\; e_i(k-1), \ldots, e_i(k-n_{e_i}) \,\} + e_i(k) \tag{10}
\]

where $i = 1, \ldots, m$ and $j = 1, \ldots, r$ indicate an $r$-input, $m$-output non-linear system. Hence, in the model structure of equation (10), the maximum lags for each sub-system may be assigned different values. The non-linear form of $F_i^{\ell}(\cdot)$ is generally unknown. One of the most common and practical forms of the NARMAX model is the polynomial expansion, where the parameter estimation is a linear process. Expanding equation (10) as a polynomial of degree $\ell_i$ gives the representation

\[
y_i(k) = \sum_{j=0}^{n_i} \theta_{ij}\, x_{ij}(k) + e_i(k), \qquad i = 1, \ldots, m \tag{11}
\]

where

\[
n_i = \sum_{j=0}^{\ell_i} n_{ij}, \qquad n_{i0} = 1 \tag{12}
\]

\[
n_{ij} = \frac{ n_{i,j-1} \Big( \sum_{k=1}^{m} \big( n^{i}_{y_k} + n^{i}_{e_k} \big) + \sum_{k=1}^{r} n^{i}_{u_k} + j - 1 \Big) }{ j }, \qquad j = 1, \ldots, \ell_i \tag{13}
\]

and the $x_{ij}(k)$ are monomials of degree up to $\ell_i$, each one consisting of delayed outputs and inputs (process terms) and terms which also involve noise (noise terms). Genetic Programming is then used as an alternative way to determine the model structure of MIMO systems, and the previous work on system identification is the foundation of the present research on modelling MIMO systems.
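To give a feel for how quickly the number of candidate terms grows, the recursion of equations (12) and (13) can be evaluated directly. The following is a minimal sketch; the function name and argument layout are illustrative assumptions, not part of the original work.

```python
def narmax_term_count(n_y, n_u, n_e, degree):
    """Count candidate terms for one sub-system via eqs. (12)-(13).

    n_y, n_e: maximum output / noise lags, one entry per output (length m)
    n_u:      maximum input lags, one entry per input (length r)
    degree:   degree of non-linearity of the polynomial expansion
    """
    total_lags = sum(n_y) + sum(n_e) + sum(n_u)
    n_ij = 1            # n_i0 = 1 accounts for the constant term
    total = n_ij
    for j in range(1, degree + 1):
        # eq. (13); the product is always divisible by j
        n_ij = n_ij * (total_lags + j - 1) // j
        total += n_ij   # running sum of eq. (12)
    return total


# SISO example with lags n_y = n_u = n_e = 2 and degree 2
print(narmax_term_count([2], [2], [2], 2))
```

Even this small SISO case yields 28 candidate terms, which illustrates why selecting the model structure is treated as a search problem.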

3. METRICS

Two predictive error measures are defined as the objectives regarding the model performance. The first metric is the residual variance, which is calculated as

\[
\sigma_{\varepsilon}^{2} = \frac{1}{N} \sum_{k=1}^{N} \big( \hat{y}(k) - y(k) \big)^{2} \tag{14}
\]

where

\[
\hat{y}(k) = f^{\ell}\big( y(k-1), \ldots, y(k-n_y),\; u(k-1), \ldots, u(k-n_u),\; e(k-1), \ldots, e(k-n_e) \big) \tag{15}
\]

The second metric, the long-term prediction error (LTPE), is computed in the same way, except that the measured past outputs in equation (15) are replaced by previously predicted ones, so that the estimated output $\hat{y}(k)$ is defined as

\[
\hat{y}(k) = f^{\ell}\big( \hat{y}(k-1), \ldots, \hat{y}(k-n_y),\; u(k-1), \ldots, u(k-n_u),\; e(k-1), \ldots, e(k-n_e) \big) \tag{16}
\]

In these equations, $\hat{y}(k)$ denotes the predicted system output.
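The difference between the two measures is only in which past outputs feed the model. A minimal Python sketch is given below. It assumes a model with equal output and input lags `n_lag` and no noise terms; these simplifications, and all function names, are illustrative assumptions rather than the original implementation.

```python
import numpy as np


def one_step_prediction(model, y, u, n_lag):
    """Eq. (15): every prediction uses the measured past outputs."""
    y_hat = np.array(y, dtype=float)
    for k in range(n_lag, len(y)):
        # reversed slices so that index 0 holds the value at k-1
        y_hat[k] = model(y[k - n_lag:k][::-1], u[k - n_lag:k][::-1])
    return y_hat


def long_term_prediction(model, y, u, n_lag):
    """Eq. (16): every prediction uses previously predicted outputs."""
    y_hat = np.array(y, dtype=float)  # first n_lag samples start the recursion
    for k in range(n_lag, len(y)):
        y_hat[k] = model(y_hat[k - n_lag:k][::-1], u[k - n_lag:k][::-1])
    return y_hat


def residual_variance(y, y_hat, n_lag):
    """Eq. (14), averaged over the samples that were actually predicted."""
    err = np.asarray(y_hat)[n_lag:] - np.asarray(y)[n_lag:]
    return float(np.mean(err ** 2))


# Toy system y(k) = 0.5 y(k-1) + u(k-1) and a slightly wrong model of it
u = np.ones(50)
y = np.zeros(50)
for k in range(1, 50):
    y[k] = 0.5 * y[k - 1] + u[k - 1]


def model(past_y, past_u):
    return 0.4 * past_y[0] + past_u[0]


print(residual_variance(y, one_step_prediction(model, y, u, 1), 1))
print(residual_variance(y, long_term_prediction(model, y, u, 1), 1))
```

In the long-term case, errors made at one step are fed back into later steps, which is why this measure is the more demanding of the two.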

4. METEOROLOGICAL TIME SERIES

In order to test the GP-NARMAX modelling and forecasting approach on a real application, a meteorological data set was used. This data set, which records the local behaviour of temperature over a long period of time, was measured near Mexico City, at Texcoco Lake. The sampling interval was 15 minutes and the total file covers approximately 37 days; for practical purposes, Table I shows only a small fragment of it. The columns correspond to the date, time, temperature (TA200), relative humidity (RH200), wind velocity (WV200), wind direction (WD200) and ground radiation (GR200), respectively. In this study, temperature is the variable to be predicted.

Figure 1. A sample of 7 days of the measured variables Temperature and Relative Humidity, respectively. (Panels: Measured Relative Humidity; Measured Temperature. Horizontal axis: Time (days).)

5. PREDICTION

5.1. Model GP Representation

TABLE I. Recorded Data at Texcoco Lake, Mexico City.

DATE      TIME  TA200  RH200  WV200  WD250  GR200
10/01/00  0945  12.5   58     1      219    411.0
10/01/00  1000  13.3   56     2      277    455.0
10/01/00  1015  14.1   55     3      263    494.0
10/01/00  1030  14.5   53     2      274    532.0
10/01/00  1045  15.3   50     2      247    570.0
10/01/00  1100  16.5   47     2      226    598.0
10/01/00  1115  17.3   45     2      253    614.0
10/01/00  1130  18.0   43     2      265    647.0
10/01/00  1145  18.4   40     4      250    669.0
10/01/00  1200  19.2   36     2      232    686.0
10/01/00  1215  19.6   34     3      230    680.0
10/01/00  1230  19.6   34     4      219    702.0
10/01/00  1245  21.2   30     1      288    702.0
10/01/00  1300  21.2   30     4      232    691.0
10/01/00  1315  21.6   29     2      254    675.0
10/01/00  1330  21.2   30     3      333    669.0
10/01/00  1345  22.4   29     3      328    642.0
10/01/00  1400  22.4   28     1      263    625.0
10/01/00  1415  22.7   27     3      243    625.0
10/01/00  1430  22.7   25     4      316    592.0
...       ...   ...    ...    ...    ...    ...
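As an illustration of how records in the layout of Table I might be read, the sketch below parses one whitespace-separated line. It assumes the date is written as day/month/year, which is consistent with the data having been collected in January and February, and that times are 24-hour HHMM values. The function name is hypothetical and the field names are simply copied from the table header.

```python
from datetime import datetime

# Column names as they appear in the header of Table I
FIELDS = ("TA200", "RH200", "WV200", "WD250", "GR200")


def parse_record(line):
    """Split one record into a timestamp and a dict of measured values."""
    date, time, *values = line.split()
    # Assumption: dd/mm/yy dates and HHMM times
    stamp = datetime.strptime(f"{date} {time}", "%d/%m/%y %H%M")
    return stamp, dict(zip(FIELDS, map(float, values)))


stamp, values = parse_record("10/01/00 0945 12.5 58 1 219 411.0")
print(stamp, values["TA200"])
```

With a 15-minute sampling interval there are 96 samples per day, so a lag of four samples corresponds to one hour.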

In previous work (Rodríguez-Vázquez and Fleming, 1998), a population of non-linear polynomial structures, expressed as hierarchical trees, was evolved in order to determine the correct model of the system under investigation or, at least, a model able to describe the dynamics of the non-linear system. In this case, the idea is to obtain a model that reproduces the series and is able to predict its future values. Based upon equation (1), the terminal set and the function set appropriate to this problem are

\[
T = \{X_0, \ldots, X_{n_y}, X_{n_y+1}, \ldots, X_{n_y+n_u}, X_{n_y+n_u+1}, \ldots, X_{n_y+n_u+n_e}\}
  = \{c,\, y(k-1), \ldots, y(k-n_y),\, u(k-1), \ldots, u(k-n_u),\, e(k-1), \ldots, e(k-n_e)\}
\]

and

\[
F = \{\mathrm{ADD}, \mathrm{MULT}\} = \{+, *\}
\]

respectively. Because the polynomial model is the one used in this work, only two functions are required: addition of a term to the model (ADD operation) and increase of the degree of non-linearity (MULT operation).
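To make the tree encoding concrete, the sketch below expands a tree built from ADD and MULT nodes into the set of polynomial terms it represents. The nested-tuple representation, the function name and the treatment of the constant `c` as the empty monomial are assumptions made for illustration; the original work does not specify its decoding routine here.

```python
from itertools import product


def expand(tree):
    """Return the set of monomials encoded by a GP tree.

    A tree is either a terminal name (str) or a tuple (op, left, right).
    A monomial is a sorted tuple of terminal names; () is the constant term.
    """
    if isinstance(tree, str):
        return {()} if tree == "c" else {(tree,)}
    op, left, right = tree
    a, b = expand(left), expand(right)
    if op == "ADD":
        # ADD joins the terms of both sub-trees
        return a | b
    if op == "MULT":
        # MULT multiplies every term on the left by every term on the right
        return {tuple(sorted(p + q)) for p, q in product(a, b)}
    raise ValueError(f"unknown function node: {op}")


tree = ("ADD", "c", ("MULT", "y(k-1)", ("ADD", "u(k-1)", "y(k-2)")))
for term in sorted(expand(tree)):
    print(term)
```

This example tree encodes a constant plus the two second-degree terms y(k-1)u(k-1) and y(k-1)y(k-2); the coefficient of each term would then be estimated by Least Squares.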

5.2. Results Analysis

To illustrate the recorded data, Figure 1 shows a window of 7 days of the variables Temperature and Relative Humidity. As mentioned above, data were gathered every 15 minutes.

Computational settings for this problem are summarised in Table II. Here, a simple crossover is performed, which allows two trees to be crossed over at any point. In the case of mutation, a terminal node can be exchanged for either another terminal or a newly generated sub-expression. If a function node is selected for mutation, this node and its associated sub-expression nodes are eliminated and substituted by either a single terminal or a newly generated sub-expression.

Table II. Computational settings for GP in Time Series Modelling.

Population Size         200
Maximum Generation      200
Crossover Frequency     0.9
Mutation Frequency      0.05
Termination Criterion   Maximum Generation
Fitness Function        Eq. (15) and (16)

By means of this simple implementation, in a first experiment, Temperature is the variable to be predicted. The estimate of the future behaviour of this variable is taken to depend on the Relative Humidity and on previous values of temperature. Thus, the terminal set is defined as

\[
T = \{c,\, TA(k-1), \ldots, TA(k-n_{TA}),\, RH(k-1), \ldots, RH(k-n_{RH}),\, e(k-1), \ldots, e(k-n_e)\}
\]

where TA is the Temperature, RH the Relative Humidity and the variable e is used to model the noise introduced when gathering the data. The function set is defined as described previously.

In evaluating the population, equation (15) is applied first and the coefficients of the model equation are estimated by means of a Least Squares algorithm. Once the structure (evolved by means of GP) and its associated coefficients are determined, equation (16) is applied in order to get a measure of the quality of prediction of future values. By this process, and after the termination criterion was reached, a mathematical expression able to model the behaviour of temperature and predict future values was obtained. This structure consists of the following terms and coefficients:

\[
\begin{aligned}
TA(k) ={} & -0.1554\, RH(k-1) + 0.4797\, TA(k-4) + 0.0229\, RH(k-1) \\
& \;\text{[?]}\; 0.0038\, TA(k-1)\, RH(k-5) \;\text{[?]}\; 0.0025\, TA(k-1)\, RH(k-3) \;\text{[?]}\; 0.0354\, TA(k-1)\, e(k-1) \\
& + 0.0054\, RH(k-3)\, e(k-1) - 1.7825\, e(k-1) + 10.2790
\end{aligned}
\tag{17}
\]

Note that equation (17) only contains linear and second-degree non-linear terms, and that the only temperature values involved are the last measured value (TA(k-1)) and the temperature measured an hour before (TA(k-4))¹.

Figure 2. Predicted temperature by means of equation (17). (Plot title: Predicted Temperature. Vertical axis: Temperature (C); horizontal axis: Time (days).)
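The coefficient estimation used when evaluating each candidate structure follows the extended Least Squares idea outlined in Section 2.1: fit the process terms, compute residuals, add noise terms built from those residuals, and repeat. A minimal sketch with NumPy is given below. The function names, the choice of a single lagged residual e(k-1) as the only noise term, and the convergence test are illustrative assumptions, not the original implementation.

```python
import numpy as np


def lagged_residual(eps):
    """Noise regressor e(k-1): residuals delayed by one sample (0 at the start)."""
    return np.concatenate([[0.0], eps[:-1]])[:, None]


def extended_least_squares(P_process, y, noise_regressors=lagged_residual,
                           max_iter=20, tol=1e-8):
    """Iteratively estimate process and noise coefficients.

    P_process: (N, p) matrix whose columns are the process terms
    y:         (N,) vector of measured outputs
    Returns the coefficient vector (process first, noise last) and residuals.
    """
    # Step 1: estimate the process terms alone, then residuals as in eq. (5)
    theta, *_ = np.linalg.lstsq(P_process, y, rcond=None)
    eps = y - P_process @ theta
    for _ in range(max_iter):
        # Step 2: append noise terms built from the current residuals
        P = np.hstack([P_process, noise_regressors(eps)])
        theta, *_ = np.linalg.lstsq(P, y, rcond=None)
        new_eps = y - P @ theta
        # Step 3: stop once the residual sequence no longer changes
        converged = np.linalg.norm(new_eps - eps) < tol
        eps = new_eps
        if converged:
            break
    return theta, eps


# Toy data from y(k) = 0.5 y(k-1) + u(k-1), without noise
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 200)
y = np.zeros(200)
for k in range(1, 200):
    y[k] = 0.5 * y[k - 1] + u[k - 1]
P_process = np.column_stack([y[:-1], u[:-1]])
theta, eps = extended_least_squares(P_process, y[1:])
print(theta[:2])
```

Because the model is linear in the parameters, each pass is an ordinary least squares problem; only the noise columns change between iterations.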

The predicted Temperature (based on equation (17), taking the previous predicted value of Temperature to get the next one) and the measured Temperature are displayed in Figure 2. The solid line corresponds to the measured variable and the dashed line to the predicted temperature. From this Figure, it is seen that the prediction at low temperatures differs more from the measured value. Outside this range, the prediction improves; this can be corroborated at the end of day 4 (3.5 to 4 in Figure 2), when the low temperatures of that day are higher than those of previous days. The prediction error is shown in Figure 3 in order to be more specific about the ranges where this error is higher.

Figure 3. Predicted error = Measured – Predicted value. (Plot title: Predicted Error. Horizontal axis: Time (days).)

5.2.1. MISO Approach

This section introduces the concept of the multiple-input single-output (MISO) NARMAX model. From Table I, it is seen that the behaviour of temperature can depend on several factors such as wind, ground radiation, etc.

¹ This assumes that the samples are gathered every 15 minutes.

From equation (10), it is seen that each sub-system is a function of all the input, output and noise data sequences involved in the MIMO system. However, there is only one sub-system to be considered in this case, so this equation describes an r-input, 1-output system. Here, the terminal set is extended and is composed of all linear terms in the output, input and noise signals. That is,

\[
T = \{c,\, y(k-1), \ldots, y(k-n_y),\, u_j(k-1), \ldots, u_j(k-n_{u_j}), \ldots,\, e(k-1), \ldots, e(k-n_e)\}
\]

where $j = 1, \ldots, 4$ for relative humidity (RH), wind velocity (WV), wind direction (WD) and ground radiation (GR), respectively. The function set consists of the same elements as in the previous experiment.

Based on this MISO model, the GP-NARMAX approach was applied using the same GP parameters as described in Table II and the terminal set described above. It is interesting to point out that, even though four input variables were considered, the model obtained in this experiment included only two input variables and previous values of the output in the final mathematical expression. Thus, this expression was a function of previous output (TA) values, relative humidity (RH) and ground radiation (GR). The prediction error is shown in Figure 4. Note that it is lower than in the case of the SISO model (first experiment).
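The extended terminal set can be enumerated mechanically once the maximum lags are chosen. The sketch below does so for the four inputs named above; the lag values in the example are hypothetical, since the text does not state the ones used in the experiment, and the function name is an assumption.

```python
def miso_terminal_set(output, n_y, input_lags, n_e):
    """List the terminals c, y(k-1..n_y), u_j(k-1..n_uj) and e(k-1..n_e).

    input_lags maps each input name to its maximum lag n_uj.
    """
    terminals = ["c"]
    terminals += [f"{output}(k-{i})" for i in range(1, n_y + 1)]
    for name, n_u in input_lags.items():
        terminals += [f"{name}(k-{i})" for i in range(1, n_u + 1)]
    terminals += [f"e(k-{i})" for i in range(1, n_e + 1)]
    return terminals


# Hypothetical lags: two for the output, one for every input and for the noise
lags = {"RH": 1, "WV": 1, "WD": 1, "GR": 1}
print(miso_terminal_set("TA", 2, lags, 1))
```

The size of the set grows linearly with the lags, whereas the number of polynomial terms that GP can assemble from it grows much faster, which is where the evolutionary search does its work.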

Figure 4. Predicted error = Measured – Predicted value. (Plot title: Predicted Error. Horizontal axis: Time (days).)

6. CONCLUSIONS AND FURTHER WORK

This work has presented an application of the GP-NARMAX approach to the prediction of meteorological data. The model representation used in this work has proved to be robust. The expansion of the SISO form into the MISO form, and then into the MIMO version, gives a flexible modelling tool. Additionally, genetic programming has been shown to deal with these structures in an easy way.

In the context of the application, a mathematical expression consisting of nine terms, with no more than second-degree non-linearity, was obtained in the first experiment. This model was used to predict the future behaviour of temperature in the region near Mexico City. The results showed the applicability of this approach to this kind of modelling problem. Because several variables were measured, it was possible to see how diverse factors such as relative humidity and ground radiation affect the temperature. In this case, a MISO model was considered.

It is also seen that the data displayed in Table I do not include information about rainfall. This is because the data were collected during January and February, which are not in the rainy season. Rain would affect not only the temperature but also the relative humidity. Hence, a MIMO non-linear model as described by equations (10) and (11) could be introduced in order to model this meteorological system. As described in previous work (Rodríguez-Vázquez, 2000), a MIMO system can be represented as a multiple-tree structure as shown in Figure 5, where each tree represents one sub-system.

Figure 5. Multiple-tree individual representation for MIMO model system identification.

ACKNOWLEDGEMENTS

The author gratefully acknowledges the financial support of Consejo Nacional de Ciencia y Tecnología (CONACyT) under the project J34900-A.

REFERENCES

BILLINGS, S.A. AND W.S.F. VOON (1984) Least Square Parameter Estimation Algorithm for Non-Linear Systems. Int. J. Systems Sci., 15(6), pp. 601-615.

CHEN, S. AND S.A. BILLINGS (1989) Representation of Non-Linear Systems: the NARMAX Model. Int. J. Control, 49(3), pp. 1013-1032.

FONSECA, C.M., E.M. MENDES, P.J. FLEMING AND S.A. BILLINGS (1995) Non-Linear Model Terms Selection with Genetic Algorithms. IEE/IEEE Workshop on Natural Algorithms in Signal Processing, Vol. 2, University of Essex, U.K., pp. 27/1-27/8.

LEONTARITIS, I.J. AND S.A. BILLINGS (1985) Input-Output Parametric Models for Non-Linear Systems. Part I and Part II. Int. J. Control, 41(2), pp. 304-344.

RODRÍGUEZ-VÁZQUEZ, K., C.M. FONSECA AND P.J. FLEMING (1997b) Multi-Objective Genetic Programming: A Non-linear System Identification Application. Late Breaking Paper at the GP’97 Conference, pp. 207-212.

RODRÍGUEZ-VÁZQUEZ, K. AND P.J. FLEMING (1998) Multi-Objective Genetic Programming for a Gas Turbine Engine Model Identification. In: International Conference on Control, UKACC Control’98, pp. 1385-1390.

RODRÍGUEZ-VÁZQUEZ, K. (2000) Identification of Non-Linear MIMO Systems Using Evolutionary Computation. Genetic and Evolutionary Computation Conference GECCO’2000, Late Breaking Papers, pp. 411-417.
