Time Series: Empirical Characterization and Artificial Neural Network-based Selection of Forecasting Techniques
Ma. Guadalupe Villarreal Marroquín Graduate Research Assistant División de Posgrado en Ingeniería de Sistemas, FIME-UANL
[email protected]
Mauricio Cabrera-Ríos, Associate Professor (Corresponding Author), División de Posgrado en Ingeniería de Sistemas, FIME-UANL, Av. Universidad s/n, Cd. Universitaria, 66450, Apartado Postal 076 Sucursal F, San Nicolás de los Garza, Nuevo León, MÉXICO. Tel. +52(81)1492 0367, Fax +52(81)1052 3321
[email protected]
Abstract In this work, a method to facilitate the elaboration of a forecast by people with little statistical training is proposed. The method uses a simple yet sufficiently accurate time series characterization that allowed training a series of artificial neural networks (ANNs) to predict the forecasting performance of several statistical techniques. A case study is presented to demonstrate the application of the method. All techniques used, including the ANNs, were conveniently coded in MS Excel, so the computational requirements are modest. Furthermore, the results can be tabulated for quick consultation.
Key Words Artificial Neural Networks, Forecasting Techniques, Time Series Characterization
1. Introduction Our daily life is full of decisions. Everyday decisions greatly depend on the information at hand as well as on our best estimation of future information. Forecasting techniques deal with generating the latter. In many cases, historic records of the behavior of a phenomenon are the basis for generating estimates about its future behavior. Such a historic evolution, organized chronologically, is commonly known as a time series. When it comes to time series forecasting, it is reasonable to expect that better decisions be made based on better forecasts. Although many people are required to generate forecasts on a daily basis, three issues frequently make their task difficult: (1) not having the proper statistical background, (2) not having specialized forecasting software
available, and (3) not having the time to compare several methods to find the one best suited to the time series under analysis. These three issues provide the motivation for this work. Three phases were necessary to arrive at the results presented here. The first phase consisted of exploring and coding eight traditional time series forecasting techniques in order to measure their forecasting performance. The second phase dealt with developing a simple, yet sufficiently accurate time series characterization. The third phase included training artificial neural networks (ANNs) with the statistical methods and their performance on a set of sample time series. The proposed method does not depend on highly specialized statistical software and allows selecting an adequate forecasting technique in a consistent manner through a simple-to-obtain characterization. The ANNs and the forecasting techniques were conveniently coded in MS Excel, and are therefore easy to distribute. Furthermore, the results can be tabulated for quick consultation. A potential impact of this work is that, through the application of the proposed method, people with little or no statistical training could select competitive forecasting techniques for their series, thereby improving their output and their decision-making. The performance of the method is assessed on a statistical basis, and its application is demonstrated through a case study.
2. Literature Review Time series forecasting has been used in many real cases as a decision-making aid. Although many techniques are available for forecasting, the variation that can be found in time series does not allow for a universally best technique. Selection of an adequate technique is considered a non-trivial task that requires experience and statistical knowledge (Prudencio et al., 2004). In the literature, using ANNs as time series forecasters is the main topic of several works. For instance, some works use ANNs to predict air quality (Gardner and Dorling, 1999; Kolehmainen et al., 2001; Niska et al., 2004); another (Kuo and Xue, 1999) compares the performance of ANNs and the autoregressive moving average model in forecasting demand in a chain of Chinese supermarkets. More recently, however, an increased focus on using ANNs to select forecasting techniques, taking the characterization of a time series as input, is evident (Venkatachalam and Sohl, 1999).
Several works have concluded that there is a strong dependence of forecasting performance on the characteristics of the time series under study (Venkatachalam and Sohl, 1999; Meade, 2000; Prudencio et al., 2004; Inoue and Kilian, 2006). In practically all reviewed published works, however, obtaining such characteristics requires formal training in statistics. Many people who could benefit from better forecasts, nevertheless, do not have enough of such background or lack it completely. The method proposed in this work provides a time series characterization that is simple to obtain and that does not require a formal statistics background. It also provides ready-to-use ANNs to predict, based on such characterization, the performance of several well-known statistical forecasting techniques, so that the user can simply pick the winner and apply it. The details of the method are described in the ensuing section.
3. Proposed Method Figure 1 schematically shows the proposed method. The technique selection part of this method is based on the ideas presented by Gupta et al. (2000) in the area of job scheduling in manufacturing systems. The objective of the proposed method is to determine the most adequate forecasting technique for a particular time series from a set of known techniques, based on two easy-to-obtain series parameters. The proposed method capitalizes on the ANNs' prediction capabilities to estimate the forecasting mean square error (MSE) of several statistical techniques. ANNs have been mathematically proven to be universal approximators for analytical functions (Hornik et al., 1989), and the feasibility of using MSE as a measure of forecasting error is documented in (Hillier and Lieberman, 2001; Inoue and Kilian, 2006).
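To illustrate the error measure involved, the one-step-ahead forecasting MSE compares each forecast against the value actually observed. A minimal Python sketch (the paper's implementation is in MS Excel; the function name here is illustrative), using the naïve method, whose forecast for each period is simply the previous observation:

```python
def naive_forecast_mse(series):
    """One-step-ahead MSE of the naive method: forecast for period i is series[i-1]."""
    errors = [(actual - previous) ** 2
              for previous, actual in zip(series, series[1:])]
    return sum(errors) / len(errors)

# Example: forecasts are 1.0 and 2.0, actuals are 2.0 and 4.0 -> MSE = (1 + 4)/2
example_mse = naive_forecast_mse([1.0, 2.0, 4.0])  # 2.5
```

The same comparison of forecast against actual, averaged over the forecastable periods, applies to each of the statistical techniques considered.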
The steps of the method, starting with a given time series to be analyzed, are as follows:
1. Normalize the data to fall in the range [-1, 1]. This step is necessary to avoid dimensionality effects.
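This normalization step can be sketched as a min-max rescaling onto [-1, 1]; a minimal Python version (the paper performs this in MS Excel):

```python
def normalize(series):
    """Rescale a series linearly so its minimum maps to -1 and its maximum to 1."""
    lo, hi = min(series), max(series)
    if hi == lo:                      # constant series: map everything to 0
        return [0.0] * len(series)
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in series]

scaled = normalize([0, 5, 10])        # [-1.0, 0.0, 1.0]
```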
2. Characterize the time series. Two parameters must be determined: the number of periods in the time series (t) and the degree of the first polynomial (n) that fits the data with a coefficient of determination R2 ≥ ε. Both parameters, t and n, are easily obtainable. Parameter t is a direct count of the data points in the series, and n results from a sequence of polynomial fit trials in increasing order of degree. Both can be retrieved from a simple scatter plot of data vs. time period in Excel (Figure 2). In order to find n, it is only necessary to use the polynomial trend line tool and select the option to show the equation and the R2 value on screen. The procedure is to start with the lowest degree and keep increasing it until R2 falls on or above a certain threshold ε (Figure 2). In this work a value of ε = 80% was used.
In order to verify that 80% is a reasonable value, we set out to statistically test it for the particular case of fitting a first-order polynomial to a time series with at least 12 points (periods). A hypothesis test for correlation was carried out, where the null hypothesis was associated with no linear correlation between the time series data and the period number (ρ = 0), while the alternative hypothesis was associated with a significant linear relationship (ρ ≠ 0). The particular test instance used a 12-period time series (m = 12), R2 = 80%, and a significance level α = 0.01. The resulting test statistic is t = R√(m−2)/√(1−R2) = (0.89)√(12−2)/√(1−0.8) = 6.29, which must be compared to a table value tα, m−2 = 2.764. Because t ≥ tα, m−2, we can conclude that there is indeed a significant linear correlation at the chosen level. This same test can be used for different numbers of periods and R2 values. A close examination of the test statistic shows that its value increases with a larger number of periods (m), while tα, m−2 decreases with larger m values, as can be verified through statistical tables. Based on these observations, we can conclude that for any number of periods m ≥ 12, with R2 = 80% and α = 0.01, a significant linear correlation will be found, i.e. t ≥ tα, m−2. Thus, at least for the cases where a first-order polynomial fit is attempted, an R2 of 80% will indicate an adequate fit if the series has no less than 12 periods. Time series with this last characteristic, fortunately, are highly frequent in practice. This result is an indication that 80% is a reasonable choice for the R2 threshold in our method, provided that the number of periods in the series meets the described constraint.
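The characterization step and the correlation test above can be sketched in Python without specialized libraries (the paper uses Excel's trend-line tool; the helper names below are illustrative). The polynomial fit is computed by solving the least-squares normal equations directly:

```python
import math

def polyfit_r2(y, degree):
    """R^2 of a least-squares polynomial fit of the given degree over periods 1..t,
    obtained by solving the normal equations with Gaussian elimination."""
    m, k = len(y), degree + 1
    xs = list(range(1, m + 1))
    A = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum((x ** i) * yv for x, yv in zip(xs, y)) for i in range(k)]
    for col in range(k):                              # elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):                    # back substitution
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    fit = [sum(coef[j] * x ** j for j in range(k)) for x in xs]
    ybar = sum(y) / m
    ss_res = sum((yv - fv) ** 2 for yv, fv in zip(y, fit))
    ss_tot = sum((yv - ybar) ** 2 for yv in y)
    return 1.0 - ss_res / ss_tot if ss_tot else 1.0

def characterize(series, eps=0.80, max_degree=6):
    """Return (t, n): the period count and the first degree whose fit has R^2 >= eps."""
    for n in range(1, max_degree + 1):
        if polyfit_r2(series, n) >= eps:
            return len(series), n
    return len(series), max_degree

def linear_corr_tstat(r2, m):
    """Test statistic t = R*sqrt(m-2)/sqrt(1-R^2) for the correlation hypothesis test."""
    return math.sqrt(r2) * math.sqrt(m - 2) / math.sqrt(1.0 - r2)
```

For example, a perfectly linear 12-period series characterizes as (t, n) = (12, 1), and linear_corr_tstat(0.8, 12) gives about 6.32, which exceeds the table value 2.764 (the text reports 6.29 after rounding R to 0.89).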
As can be seen from the way the parameters are obtained, no statistical background is necessary to characterize a time series in this manner. This characterization is, indeed, one of the original contributions of this work.
3. Use the ANNs to predict the performance of the forecasting techniques. Previously trained ANNs that take n and t as inputs are used to predict the forecasting MSE for the statistical techniques. An ANN, in the context of the proposed method, is essentially a nonlinear mathematical model that contains adjustable parameters known as "weights". A larger number of weights allows the ANN to fit higher degrees of nonlinearity. In order to find these weights, least-squares procedures are usually employed; in the ANN literature, several of these procedures are classified as backpropagation algorithms. Finding the weights of an ANN such that the fit to a set of known data is maximized is called training.
The original backpropagation algorithm was developed in 1974 by Paul Werbos; nevertheless, its potential was not immediately recognized by researchers in ANNs. It was not until 1986 that the backpropagation algorithm was popularized by Rumelhart et al. (1986), after which ANNs began to be applied to problems such as time series forecasting. ANNs have been gaining ground year by year since then (Zhang et al., 1998). One recent attempt to find several ANN parameters simultaneously in forecasting can be found in (Salazar Aguilar et al., 2006).
The ANNs proposed for this stage of the method follow the scheme shown in Figure 3. The ANN shown has three neuron layers: one input layer that receives the values of t and n, one hidden layer that processes this information, and one output layer that predicts the forecasting MSE for a particular technique.
Referring to Figure 3, vi,j is the weight of the arc into the jth neuron in the hidden layer coming from the ith neuron in the input layer, and wj is the weight of the arc into the output neuron coming from the jth neuron in the hidden layer. These weights modify the information passing through their associated arcs. The weights assigned to the arcs coming from neurons with a preset value of 1 are called biases.
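The forward pass of this architecture can be sketched compactly. The paper does not spell out the transfer functions, so a tanh hidden activation and a linear output are assumed here; the weight layout follows Figure 3 as described above:

```python
import math

def ann_predict(t, n, V, w):
    """Forward pass of the 2-input, single-hidden-layer, 1-output ANN.

    V[j] = [bias, v_tj, v_nj] are the weights into hidden neuron j;
    w    = [bias, w_1, ..., w_h] are the weights into the output neuron.
    Hidden activation (tanh) and linear output are assumptions, not
    stated in the text.
    """
    hidden = [math.tanh(vj[0] + vj[1] * t + vj[2] * n) for vj in V]
    return w[0] + sum(wj * hj for wj, hj in zip(w[1:], hidden))
```

For instance, with all hidden weights at zero every hidden neuron outputs tanh(0) = 0, so the prediction reduces to the output bias.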
4. Rank the techniques and pick the winner. Finally, the techniques are ranked according to their predicted forecasting MSE, and the one with the lowest value is chosen. Once the method has been applied, the predicted winner is clear and, as shown later, the chosen technique has a high probability of doing an adequate forecasting job. The method as outlined above uses a set of previously trained ANNs; the following section deals with how these ANNs were built.
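The ranking step amounts to sorting the techniques by their predicted MSE; a minimal sketch with hypothetical predicted values (for illustration only, not the paper's results):

```python
def rank_techniques(predicted_mse):
    """Return technique names ordered from lowest to highest predicted MSE."""
    return sorted(predicted_mse, key=predicted_mse.get)

# Hypothetical predictions for three techniques:
preds = {"Naive": 0.33, "Average": 0.44, "Linear Regression": 0.11}
ranking = rank_techniques(preds)
winner = ranking[0]          # the technique with the lowest predicted MSE
```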
4. Obtaining the ANNs to be used in the proposed method The ANNs required for the proposed method must be capable of predicting the forecasting performance of each technique. As explained previously, in order to build an ANN, a training procedure must be applied. In addition, a validation phase is also necessary to make sure that the model has an adequate prediction capability. A total of 54 time series were generated for ANN training purposes. Each time series was preprocessed as indicated in the first step of the method. The series used in training had t ∈ {12, 36, 60} and a polynomial approximation given by n ∈ {1, 2, 3, 4, 5, 6}. This results in a total of 3×6 = 18 combinations of t and n. For each of these combinations, 3 replicates were generated through a stochastic additive perturbation, for a new total of 54 time series. For the validation phase, 22 series were used: 12 of them result from all possible combinations of t ∈ {24, 48} and n ∈ {1, 2, 3, 4, 5, 6}; 8 of them were series found in forecasting textbooks (Makridakis and Wheelwright, 1998), and the other two are real-data time series provided by a local telecommunications company, with t ∈ [12, 70] and n ∈ [1, 6]. Eight traditional forecasting techniques were completely coded in MS Excel to carry out the analyses: (1) naïve method, (2) average, (3) moving average, (4) single exponential smoothing, (5) ARIMA (0,1,1), (6)
linear regression, (7) double exponential smoothing, and (8) ARIMA (0,2,2) (Hillier and Lieberman, 2001; Makridakis and Wheelwright, 1998). ANNs with two and three hidden neurons were trained with promising results in a first experiment. Since in this initial experiment there was a slight trend for the error to decrease as the number of hidden neurons increased, additional ANNs with four to seven hidden neurons were trained in a follow-up experiment. Several authors have studied the choice of an adequate number of neurons in the hidden layer (Zhang, 2004; Hansen and Nelson, 2003; Sexton et al., 2005), since a larger number than necessary results in the loss of the ANN's prediction capability, a phenomenon called overtraining. ANN training was approached in this work through Solver, the optimization tool embedded in MS Excel. Because Solver is a local optimizer, and the nonlinear optimization problem posed by the training task was found to have local minima, multiple starting points were used to increase the probability of finding a low sum of squared errors (SSE). The weights were initialized nine times: the first five times, all weights took the same value from the set {-1, -0.5, 0, 0.5, 1}, and the other four times they were initialized with random values between -1 and 1. The weights yielding the lowest SSE computed from a validation set were kept, to preserve the ANN's prediction capability; a lower validation error in this case indicates a higher generalization power. The validation error was used, then, to determine the number of hidden neurons, i.e. to choose between 2 or 3 hidden neurons in the first experiment, and among values ranging from 4 to 7 in the second one. Table 1 shows the values of the validation SSE for each ANN in the first experiment.
Forecasting Technique          SSE (two neurons)   SSE (three neurons)
Naïve Method                   0.950               0.954
Average                        2.965               3.028
Moving Average                 0.260               0.223
ARIMA(0,1,1)                   0.680               0.948
Single Exponential Smoothing   0.924               0.954
Linear Regression              2.552               1.881
ARIMA(0,2,2)                   1.409               1.210
Double Exponential Smoothing   1.501               1.245

Table 1. Validation SSE values for ANNs with two and three hidden neurons (First Experiment)
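The multi-start training strategy described above can be sketched generically. This is an illustration only: a simple random-walk descent stands in for MS Solver, and the model, data, and iteration budget are placeholders rather than the paper's actual settings:

```python
import random

def sse(f, params, data):
    """Sum of squared errors of model f(x, params) over (x, y) pairs."""
    return sum((y - f(x, params)) ** 2 for x, y in data)

def local_search(f, start, data, iters=300, step=0.1, seed=0):
    """Greedy random-walk descent: accept a perturbed point only if SSE improves."""
    rnd = random.Random(seed)
    best, best_sse = list(start), sse(f, start, data)
    for _ in range(iters):
        cand = [p + rnd.uniform(-step, step) for p in best]
        cand_sse = sse(f, cand, data)
        if cand_sse < best_sse:
            best, best_sse = cand, cand_sse
    return best

def multistart_train(f, n_params, train_data, val_data):
    """Nine initializations as in the text: five uniform weight vectors and four
    random ones in [-1, 1]; keep the result with the lowest validation SSE."""
    starts = [[v] * n_params for v in (-1.0, -0.5, 0.0, 0.5, 1.0)]
    rnd = random.Random(42)
    starts += [[rnd.uniform(-1, 1) for _ in range(n_params)] for _ in range(4)]
    trained = [local_search(f, s, train_data, seed=i) for i, s in enumerate(starts)]
    return min(trained, key=lambda wts: sse(f, wts, val_data))

# Toy usage: fit a line y = a + b*x to synthetic data
line = lambda x, p: p[0] + p[1] * x
train = [(x, 1.0 + 2.0 * x) for x in range(5)]
weights = multistart_train(line, 2, train, train)
```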
Table 1 shows that no single ANN dominates across the techniques with regard to SSE. This was corroborated through a hypothesis test for two population means, where the null hypothesis states that there is no difference between the means (H0: µ1 = µ2), and the alternative hypothesis states that a statistical difference between them exists. A significance level of α = 0.01 was used. When the test statistic, with a resulting value of |t14| = 0.244, was compared to the value from statistical tables t0.005,14 = 2.927, it was clear that the null hypothesis could not be rejected, i.e. there was no statistical difference between the means at the stated level. This conclusion implied that a simpler 2-hidden-neuron ANN would give predictions as good as those of a more complicated 3-hidden-neuron ANN. The second experiment included the 2- and 3-hidden-neuron ANNs from the first experiment and additional ANNs with up to 7 hidden neurons, as explained previously. Table 2 shows the validation SSE values for this experiment.
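The two-means comparison can be reproduced from the Table 1 columns. A sketch assuming a pooled-variance two-sample t test with 14 degrees of freedom (the exact variant used is not stated in the text), which yields a statistic close to the reported 0.244:

```python
import math

def pooled_t(a, b):
    """Pooled-variance two-sample t statistic with len(a) + len(b) - 2 df."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    sp = math.sqrt(((na - 1) * va + (nb - 1) * vb) / (na + nb - 2))
    return (ma - mb) / (sp * math.sqrt(1.0 / na + 1.0 / nb))

# Validation SSE columns from Table 1 (two- and three-hidden-neuron ANNs)
two = [0.950, 2.965, 0.260, 0.680, 0.924, 2.552, 1.409, 1.501]
three = [0.954, 3.028, 0.223, 0.948, 0.954, 1.881, 1.210, 1.245]
t_stat = pooled_t(two, three)   # about 0.23, well below the tabled critical value
```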
Forecasting Technique          SSE (2)  SSE (3)  SSE (4)  SSE (5)  SSE (6)  SSE (7)
Naïve Method                   0.950    0.954    0.949    0.929    0.855    0.854*
Average                        2.965    3.028    3.017    3.448    2.961*   3.028
Moving Average                 0.260    0.223    0.288    0.294    0.216*   0.232
ARIMA(0,1,1)                   0.680    0.948    0.964    0.326    0.282*   0.413
Single Exponential Smoothing   0.924    0.954    0.866*   1.055    1.086    1.045
Linear Regression              2.552    1.881    0.874*   1.764    0.884    1.373
ARIMA(0,2,2)                   1.409    1.210    0.704*   1.505    1.092    1.073
Double Exponential Smoothing   1.501    1.245    0.882*   1.111    1.180    2.228

Table 2. Validation SSE values for ANNs with two through seven hidden neurons (Second Experiment). An asterisk marks the lowest validation SSE for each technique.
Again, in Table 2, it is evident that no particular ANN dominates across all techniques. This time, however, the presence of competitive clusters in the 4- and 6-hidden-neuron ANNs led us to use them, along with the 7-hidden-neuron ANN, according to the particular technique for which each showed the best performance (the best value for each technique in Table 2). For example, to predict the forecasting MSE for the naïve method, a 7-hidden-neuron ANN is used; for the Average, a 6-hidden-neuron ANN, and so forth.
The selected ANNs can then be used to predict the MSE of all eight techniques for time series characterized through n and t.
5. Results Validation series were used to demonstrate the use of the ANNs described previously. Because the validation series were not used to find the ANNs' weights, the objectivity of the analysis is preserved. Table 3 shows the real MSE for the 22 validation time series, each characterized through a particular combination of n and t. Table 4 shows the ANN-predicted MSE for each technique in the first experiment, i.e. using two-hidden-neuron ANNs. Table 5 shows the ANN-predicted MSE for each technique using the ANNs chosen in the second experiment.
n   t    Naïve   Average  Mov.Avg.  ARIMA(0,1,1)  SES    Lin.Reg.  ARIMA(0,2,2)  DES
1   24   0.05    0.39     0.10      0.05          0.05   0.04      0.03          0.04
2   24   0.09    0.27     0.11      0.08          0.08   0.06      0.07          0.08
3   24   0.22    0.48     0.18      0.16          0.17   0.11      0.12          0.17
4   24   0.20    0.43     0.18      0.14          0.14   0.10      0.13          0.13
5   24   0.14    0.34     0.15      0.11          0.11   0.07      0.09          0.11
6   24   0.16    0.36     0.13      0.13          0.13   0.08      0.14          0.12
1   48   0.07    0.27     0.07      0.05          0.05   0.05      0.05          0.05
2   48   0.09    0.27     0.08      0.07          0.07   0.06      0.05          0.06
3   48   0.13    0.30     0.10      0.09          0.09   0.07      0.08          0.08
4   48   0.10    0.26     0.09      0.07          0.07   0.06      0.07          0.08
5   48   0.10    0.27     0.09      0.07          0.07   0.07      0.09          0.08
6   48   0.11    0.23     0.06      0.07          0.07   0.05      0.07          0.07
1   24   0.01    0.44     0.09      0.01          0.01   0.01      0.00          0.00
2   24   0.02    0.39     0.15      0.01          0.02   0.10      0.00          0.00
5   24   0.08    0.41     0.29      0.20          0.08   0.08      0.09          0.08
1   24   0.13    0.32     0.13      0.10          0.13   0.06      0.07          0.17
1   24   0.09    0.53     0.15      0.08          0.09   0.06      0.07          0.25
1   24   0.08    0.38     0.12      0.07          0.08   0.05      0.05          0.08
4   48   0.01    0.41     0.14      0.12          0.01   0.01      0.01          0.01
2   48   0.01    0.40     0.33      0.12          0.01   0.00      0.00          0.00
1   67   0.01    0.40     0.04      0.05          0.01   0.01      0.01          0.01
4   67   0.03    0.30     0.07      0.07          0.03   0.03      0.03          0.03

Table 3. Real MSE for the validation time series. SES = Single Exponential Smoothing; DES = Double Exponential Smoothing.
n   t    Naïve   Average  Mov.Avg.  ARIMA(0,1,1)  SES    Lin.Reg.  ARIMA(0,2,2)  DES
1   24   0.09    0.43     0.15      0.10          0.09   0.04      0.06          0.07
2   24   0.09    0.42     0.15      0.10          0.15   0.06      0.07          0.07
3   24   0.11    0.40     0.15      0.11          0.16   0.07      0.07          0.08
4   24   0.15    0.38     0.15      0.12          0.16   0.08      0.08          0.08
5   24   0.19    0.35     0.15      0.12          0.16   0.11      0.08          0.09
6   24   0.21    0.32     0.15      0.13          0.15   0.12      0.09          0.10
1   48   0.08    0.26     0.08      0.07          0.06   0.04      0.06          0.05
2   48   0.08    0.26     0.08      0.07          0.08   0.06      0.07          0.06
3   48   0.08    0.26     0.08      0.07          0.09   0.06      0.07          0.06
4   48   0.08    0.26     0.08      0.07          0.09   0.06      0.08          0.07
5   48   0.08    0.26     0.08      0.07          0.08   0.06      0.08          0.07
6   48   0.09    0.26     0.08      0.07          0.08   0.06      0.09          0.08
1   67   0.09    0.26     0.07      0.06          0.05   0.04      0.06          0.05
4   67   0.09    0.26     0.07      0.07          0.06   0.06      0.07          0.06

Table 4. MSE predicted for the validation time series using an ANN with two neurons in the hidden layer (First Experiment). SES = Single Exponential Smoothing; DES = Double Exponential Smoothing.

n   t    Naïve   Average  Mov.Avg.  ARIMA(0,1,1)  SES    Lin.Reg.  ARIMA(0,2,2)  DES
1   24   0.09    0.44     0.13      0.06          0.09   0.06      0.06          0.05
2   24   0.09    0.42     0.17      0.10          0.15   0.08      0.11          0.07
3   24   0.11    0.41     0.18      0.12          0.16   0.10      0.11          0.09
4   24   0.15    0.39     0.18      0.13          0.16   0.10      0.10          0.09
5   24   0.17    0.36     0.17      0.13          0.15   0.10      0.10          0.09
6   24   0.20    0.33     0.16      0.12          0.15   0.10      0.10          0.08
1   48   0.08    0.26     0.07      0.07          0.06   0.07      0.06          0.04
2   48   0.08    0.26     0.08      0.07          0.08   0.07      0.06          0.06
3   48   0.08    0.26     0.08      0.07          0.09   0.07      0.06          0.06
4   48   0.08    0.25     0.08      0.07          0.09   0.08      0.06          0.06
5   48   0.08    0.25     0.08      0.07          0.09   0.08      0.06          0.06
6   48   0.08    0.25     0.08      0.07          0.09   0.08      0.06          0.06
1   67   0.08    0.26     0.07      0.07          0.07   0.06      0.05          0.05
4   67   0.08    0.26     0.07      0.07          0.07   0.08      0.06          0.06

Table 5. MSE predicted for the validation time series using a specific ANN from Table 2 (Second Experiment).
A close examination of Tables 3, 4 and 5 reveals that several series show the same MSE for different techniques; this is because some of the techniques are particular instances of others. This is the case, for example, of the average and the moving average. In addition, the real MSE values for both methods follow similar patterns. When the predicted MSE is the same for different techniques in a particular series, these techniques are equally ranked, leaving it to the user to choose one. Table 6 summarizes the data in Tables 3 and 4, and Table 7 does the same for the data in Tables 3 and 5. The summaries show the number of instances in which the best technique (based on the real MSE) was predicted as the best one, the second best or the third best.
Predicted by the ANNs      Real Best Technique
Best Technique                 14
Second Best Technique          16
Third Best Technique           12
Other                           7
Total                          49

Table 6. Number of instances in which the best technique was predicted as the best one, second best and third best by the ANNs in the first experiment.
Predicted by the ANNs      Real Best Technique
Best Technique                 17
Second Best Technique          15
Third Best Technique           10
Other                           7
Total                          49

Table 7. Number of instances in which the best technique was predicted as the best one, second best and third best by the ANNs in the second experiment.
In Tables 6 and 7, it can be seen that in 42 out of 49 cases the ANNs predicted the best forecasting technique among the first three in both experiments. Nevertheless, the ANNs in the second experiment showed a better performance, since the best technique was correctly predicted in 17 cases, versus 14 cases in the first experiment. A 95% confidence interval for the proportion (θ) computed from this information is 0.751 < θ < 0.962; that is, with 95% confidence, the proportion of instances in which the actual best technique is predicted by the ANNs among the best three falls between 75% and 96%. The reason for having 49 cases in Tables 6 and 7 when there were only 22 validation series has to do with tied real MSE values: if the same MSE value is found for three different techniques, the case is counted as three instances instead of one.
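The interval can be approximately reproduced with the normal approximation for a proportion; a sketch (the exact construction used in the text, which yields (0.751, 0.962), is not specified, so the endpoints below differ slightly):

```python
import math

def proportion_ci(successes, trials, z=1.96):
    """Normal-approximation 95% confidence interval for a proportion."""
    p = successes / trials
    half = z * math.sqrt(p * (1.0 - p) / trials)
    return p - half, p + half

lo, hi = proportion_ci(42, 49)   # about (0.76, 0.96)
```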
6. Case Study Problem Description In order to further corroborate the previous results, a case study in a small local company is presented here. The company is dedicated to the trade of dry seeds and needed to generate forecasts for its most important product for the next three months. The time series provided is shown in Table 8.
Month   Nut Core (kilograms)
1        4000
2        2985
3        2540
4        3055
5        5155
6        4400
7        3880
8        2070
9        3050
10      10800
11      24100
12       3000

Table 8. Previous monthly nut core sales.
Application of the proposed method to select the forecasting technique
The first step was to normalize the series to fall within -1 and 1, and then find the n and t parameters as shown in Figure 4. The characterization for the series, using R2 = 80%, was n = 6 and t = 12. The MSE values predicted by the ANNs given these values of n and t are shown in Table 9.

Forecasting Technique          Predicted MSE
Naïve Method                   0.3277
Average                        0.4372
Moving Average                 0.3569
Single Exponential Smoothing   0.2282
ARIMA(0,1,1)                   0.2399
Linear Regression              0.1107 (First)
ARIMA(0,2,2)                   0.1820 (Second)
Double Exponential Smoothing   0.2035 (Third)

Table 9. MSE predicted by the ANNs for a time series with n=6 and t=12.

Table 9 shows that the best three predicted techniques are Linear Regression, ARIMA(0,2,2) and Double Exponential Smoothing, in that order, for series with 12 periods and a sixth-order polynomial approximation. Using the results from the confidence interval, it is a sensible choice to try the first three techniques and compare which one does a better forecasting job. Table 10 shows the results in terms of the real MSE, where the best three predicted techniques matched the best three real techniques.
Forecasting Technique          Real MSE for the Nut Core time series
Naïve Method                   0.5194
Average                        0.3513
Moving Average                 0.5899
Single Exponential Smoothing   0.3457 (Third)
ARIMA(0,1,1)                   0.3509
Linear Regression              0.2297 (First)
ARIMA(0,2,2)                   0.3373 (Second)
Double Exponential Smoothing   0.3457 (Third)

Table 10. MSE obtained from applying each forecasting technique to the Nut Core time series.
The forecasts for the series, obtained using Linear Regression, were 10913, 11707 and 12501 kilograms of nut core for the next three months, respectively.
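These figures can be verified directly from the Table 8 data with ordinary least squares on the raw (unnormalized) series; a minimal sketch:

```python
def ols_line(y):
    """Slope and intercept of the least-squares line through (1, y1), ..., (m, ym)."""
    m = len(y)
    xs = range(1, m + 1)
    xbar, ybar = sum(xs) / m, sum(y) / m
    sxy = sum((x - xbar) * (v - ybar) for x, v in zip(xs, y))
    sxx = sum((x - xbar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, ybar - slope * xbar

# Monthly nut core sales from Table 8
sales = [4000, 2985, 2540, 3055, 5155, 4400, 3880,
         2070, 3050, 10800, 24100, 3000]
slope, intercept = ols_line(sales)
forecasts = [round(intercept + slope * month) for month in (13, 14, 15)]
# forecasts == [10913, 11707, 12501], matching the figures reported above
```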
This case study is an example of the effectiveness of the proposed method in predicting the most competitive forecasting techniques. Of particular importance is that the application of the method was carried out in a small business, where the three identified issues that motivated this work are more likely to be present: (1) lack of formal statistical training, (2) lack of specialized software and (3) lack of time for analysis.
7. Conclusions and Future Work In this work, a method to select the best time series forecasting technique from a set of known ones was proposed. A simple, yet sufficiently accurate characterization was also developed and tested with promising results. It was described how the artificial neural networks that must be used with the method were built. The validation results showed how the ANNs were effective in finding the best real technique within the best three predicted ones. This was further supported by the elicited confidence interval for proportions. A case study in a small local company was also presented to show the effectiveness of the proposed method. Two contributions to the area of forecasting are important in this work: a selection method that is easy to implement in MS Excel and a simple time series characterization. These contributions have important practical implications, since it is precisely through them that it is possible to deal with the three issues in the practice of forecasting identified at the beginning of this work: (1) lack of formal statistical training, (2) lack of statistical software, and (3) lack of time to invest in finding the best suited forecasting technique. Future work will include extending the studies to improve the prediction performance of the ANNs, as well as testing the robustness of the ordering of the forecasting techniques. A robust ordering might lead to simpler ANNs and, therefore, to more efficient computational implementations. Furthermore, the inclusion of additional non-statistical parameters in the characterization will be investigated and evaluated. Finally, a table with the predictions of the ANNs will be created for ease of distribution and use.
References
Gardner M. W., and Dorling S. R., "Artificial neural networks (the multi-layer perceptron) - a review of applications in the atmospheric sciences", Atmospheric Environment, Vol. 33, 1999, 709-719.
Gupta J. N. D., Sexton R. S., and Tunc E. A., "Selecting Scheduling Heuristics Using Neural Networks", INFORMS Journal on Computing, Vol. 12, No. 2, Spring 2000, 150-162.
Hansen J. V., and Nelson R. D., "Forecasting and recombining time-series components by using neural networks", Journal of the Operational Research Society, No. 54, 2003, 307-317.
Hillier F. S., and Lieberman G. J., Introduction to Operations Research, 7th Edition, McGraw-Hill, 2001, 1009-1052.
Hornik K., Stinchcombe M., and White H., "Multilayer feedforward networks are universal approximators", Neural Networks, Vol. 2, No. 5, 1989, 359-366.
Inoue A., and Kilian L., "On the selection of forecasting models", Journal of Econometrics, No. 130, 2006, 273-306.
Kolehmainen M., Martikainen H., and Ruuskanen J., "Neural networks and periodic components used in air quality forecasting", Atmospheric Environment, Vol. 35, 2001, 815-825.
Kuo R. J., and Xue K. C., "Fuzzy neural networks with application to sales forecasting", Fuzzy Sets and Systems, Vol. 108, No. 2, 1999, 123-143.
Makridakis S., Wheelwright S. C., and Hyndman R. J., Forecasting: Methods and Applications, 3rd Edition, John Wiley & Sons, Inc., 1998, 42-45, 373.
Meade N., "Evidence for the Selection of Forecasting Methods", Journal of Forecasting, No. 19, 2000, 515-535.
Niska H., Hiltunen T., Karppinen A., Ruuskanen J., and Kolehmainen M., "Evolving the neural network model for forecasting air pollution time series", Engineering Applications of Artificial Intelligence, Vol. 17, 2004, 159-167.
Prudencio R. B. C., Ludermir T. B., and Carvalho F. A. T., "A Modal Symbolic Classifier for selecting time series models", Pattern Recognition Letters, No. 25, 2004, 911-921.
Rumelhart D. E., Hinton G. E., and Williams R. J., "Learning representations by back-propagating errors", Nature, 323 (6088), 1986, 533-536.
Salazar Aguilar M. A., Moreno Rodríguez G. J., and Cabrera-Ríos M., "Statistical Characterization and Optimization of Artificial Neural Networks in Time Series Forecasting: the One-Period Forecast Case", Computación y Sistemas, (Forthcoming), 2006.
Sexton R. S., McMurtrey S., Michalopoulos J. O., and Smith A. M., "Employee turnover: a neural network solution", Computers & Operations Research, Vol. 32, No. 10, 2005, 2635-2651.
Venkatachalam A. R., and Sohl J. E., "An Intelligent Model Selection and Forecasting System", Journal of Forecasting, No. 18, 1999, 167-180.
Werbos P. J., "Generalization of backpropagation with applications to a recurrent gas market model", Neural Networks, Vol. 1, 1988, 339-356.
Zhang G. P., Neural Networks in Business Forecasting, Idea Group Publishing, USA, 2004.
Zhang G., Patuwo E., and Hu Y. M., "Forecasting with artificial neural networks: the state of the art", International Journal of Forecasting, Vol. 14, No. 1, 1998, 35-62.
Acknowledgements The authors are grateful to the CONACYT for the scholarship granted to Ms. Villarreal for her graduate studies, as well as to the UANL for Research Grant PAICYT CA1069-05 that helped to support this project.
About the Authors María Guadalupe Villarreal Marroquín obtained her B.S. in Mathematics from Universidad Autónoma de Nuevo León (UANL) (2006) and is currently pursuing her studies towards the M.S. degree in the Graduate Program in Systems Engineering also at UANL. Her interests include Operations Research and Applied Math.
Mauricio Cabrera Ríos obtained his B.S. in Industrial and Systems Engineering from ITESM-Monterrey (1996), and his M.S. and Ph.D. in the same discipline from The Ohio State University (1999, 2002). He is an Associate Professor at the Graduate Program in Systems Engineering at the Universidad Autónoma de Nuevo León, México.