IJICIS, Vol. 5, No. 1, July 2005
CLIMATE CHANGE PREDICTION USING DATA MINING

N. Shikoun, H. El-Bolok
Communication, Electronics & Computer Department, Faculty of Engineering, Helwan University, Egypt
[email protected]

M. A. Ismail
Communication, Electronics & Computer Department, Faculty of Engineering, Helwan University, Egypt

M. A. Ismail
Information Technology Department, Faculty of Computers & Information, Cairo University, Egypt
Abstract: Great progress has been made in the effort to understand and predict El Nino, the anomalous warming of the sea surface temperature (SST) along the equator off the coast of South America, which has a strong impact on climate change over the world. Advances in climate prediction will yield significantly enhanced economic opportunities, particularly for the national agriculture, fishing, forestry and energy sectors, as well as social benefits. This paper presents monthly prediction of the El Nino phenomenon using artificial neural networks (ANN). The procedure addresses the preprocessing of the input data, the definition of the model architecture and the strategy of the learning process. The most important result of this paper is finding the best model architecture for long-term prediction of climate change. In addition, an error model has been developed to improve the results.

Keywords: Data mining, El Nino, Neural network, Back-propagation
1. INTRODUCTION
Better predictions of the potential for extreme climate episodes like floods and droughts could save billions of dollars in damage costs. Predicting the life cycle and strength of a Pacific warm episode ("El Nino") or cold episode ("La Nina") is critical for saving water, energy and transportation, and for helping managers and farmers plan to avoid or mitigate potential losses [4,17,19]. The field of data mining and knowledge discovery is emerging as a new fundamental research area with important applications to science, engineering, medicine, business, and education. Data mining is generally an iterative and interactive discovery process. The goal of this process is to mine patterns, rules, associations, changes, anomalies and statistically significant structures from large amounts of data. Furthermore, the mined results should be valid, novel, useful, and understandable [22]. Data mining is the central step in a process called knowledge discovery in databases, namely the step in which modeling techniques are applied. Several research areas like statistics, artificial intelligence, machine learning, and soft computing have contributed to its arsenal of methods [8]. The objective of this paper is to apply the neural network as a technique for long-term prediction of the El Nino phenomenon. A model of the errors has been developed to improve the accuracy of the results. The organization of the paper is as follows: Section 2 describes the nature of the El Nino event. The techniques used in data mining for prediction are covered in section 3. Section 4 explains the use of neural
networks for prediction on the monthly data of the El Nino event. Section 5 shows the proposed forecasting system network configuration, results and discussion. Finally, section 6 presents the conclusion.

2. EL NINO PHENOMENON
El Nino is the name given to the phenomenon which occurs when sea-surface temperatures (SSTs) in the equatorial Pacific Ocean off the South American coast become warmer than normal. These persisting warm SSTs influence the atmospheric circulation and consequently change climate patterns globally. La Nina is the opposite of El Nino; it exists when cooler than usual ocean temperatures occur. The El Nino/La Nina "cycle" does not occur with strict periodicity. Historically, an El Nino usually recurs every 3-7 years, as does its (cold) La Nina counterpart. Closely associated with El Nino and La Nina is another meteorological phenomenon called the Southern Oscillation. It refers to a very large scale exchange of sea-level air pressure between areas of the western and the southeastern Pacific. The combination is now called El Nino/Southern Oscillation, or ENSO. El Nino defines the warm phase of ENSO, and La Nina defines the cold one [18].
The El Nino area is divided into three regions: NINO 1+2, NINO 3 and NINO 4. NINO 1+2 extends from latitude 0° to 10°S and longitude 80°W to 90°W; NINO 3 from latitude 5°N to 5°S and longitude 90°W to 150°W; and NINO 4 from latitude 5°N to 5°S and longitude 150°W to 160°E.
NINO 3 is the region examined in this paper.

2.1. Review of Literature
For more than a decade, the Lamont-Doherty Earth Observatory (LDEO) model of Columbia University, a simple coupled ocean-atmosphere dynamical model [3,18], has played an important role in understanding and predicting the Nino 3 SST anomaly. However, the predictive skill of the original Lamont model (LDEO1) is severely limited by its unbalanced initialization scheme, its sole dependence on wind data and its large systematic biases. In the last few years, improvements in model initialization, data assimilation, and bias correction have produced LDEO2, LDEO3 and LDEO4. Most recently, the model has been further improved by introducing a statistical correction term in the model SST equation. It is now more straightforward to assimilate data for model initialization because of the much reduced model-data incompatibility. The new version of the model not only performs better in retrospective forecasting, but also exhibits a more realistic internal variability [4]. Recently a "Hybrid Coupled Model", which couples a simple statistical ocean model with an atmospheric model, has been applied to Nino 3 data from January 1950 to May 2004; its forecasts extend up to one year ahead [13]. The prediction of El Nino is difficult for the following reasons [16,19]:
1. There are no known conditions that make it occur. Once an El Nino has started, there is reasonably good skill in predicting the subsequent evolution over the next 6-9 months, but there is very little skill in predicting the onset before the event has become obvious.
2. There are a variety of theories for why El Ninos start, but none of them has given real skill in making a forecast in advance.
3. There are not many studies concerning El Ninos, as the event occurs irregularly, approximately every 4-5 years.
The importance of better El Nino prediction is not to control or modify the ENSO cycle, but to adapt to its consequences [23].
2.2. Global Effect of El Nino
ENSO is associated with shifts in the location and intensity of deep convection and rainfall in the tropical Pacific. During El Nino events, drought conditions prevail in northern Australia, Indonesia, and the Philippines, and excessive rains occur in the island states of the central tropical Pacific and along the west coast of South America. Shifts in the pattern of deep convection in the tropical Pacific also affect the general circulation of the atmosphere and extend the impacts of ENSO to other tropical ocean basins. During El Nino most of Canada and the northwestern United States tend to experience mild winters, and the states bordering the Gulf of Mexico tend to be cooler and wetter than normal. California has experienced a disproportionate share of episodes of heavy rainfall during El Nino winters such as 1982-1983, 1991-1992, and 1994-1995. Atlantic hurricanes tend to be less frequent during warm events and more frequent during cold events. El Nino events also disrupt the marine ecology of the tropical Pacific and the Pacific coast regions of the Americas, affecting the mortality and distribution of commercially valuable fish stocks and other marine organisms [4,13,16].
3. DATA MINING CONCEPTS
Knowledge discovery in databases (KDD) is an iterative and interactive process of discovering interesting knowledge, such as patterns, associations, changes, anomalies and significant structures, from large amounts of data stored in databases. KDD comprises many steps, which involve data preparation, data mining, knowledge evaluation and interpretation, and refinement, all repeated in multiple iterations. Data mining is the particular step in this process where specific algorithms for extracting patterns from data are applied [8,9].
Prediction techniques
Data mining techniques include a variety of methods: predictive modeling, clustering, association mining, and change and deviation detection. Predictive modeling includes classification for categorical predictions and regression analysis for numerical predictions [7,14]. More than one model can be applied to the prediction task. These models fall into two categories: parametric and data-driven. Parametric methods usually rely on parameter estimation in statistics; they include regression, discriminant analysis techniques, and the autoregressive integrated moving average (ARIMA) model [2]. The data-driven approach is used in solving problems with little knowledge about the statistical properties of the data and is better for problems with complex nonlinear data relationships [14,15]. This approach employs the power of the computer to search and iterate until a good fit to the data is achieved [20,22]; examples are decision trees, neural networks, k-nearest neighbors and genetic algorithms. In this paper a data-driven technique, the neural network, is used.
4. NEURAL NETWORK
An artificial neural network is a computing method inspired by the structure of the brain and nervous system. A typical neural network consists of a group of inter-connected processing units, which are
also called neurons. Each neuron makes its independent computation based upon a weighted sum of its inputs, and passes the results to other neurons in an organized fashion. Neurons receiving input data form the input layer, while those generating output to users form the output layer. A neural network must be trained on data for a given problem. The training process adjusts the connecting weights between neurons so that the network can generalize the features of the problem and therefore obtain the desired results. A neural network is trained from a training data set. This makes the neural network a desirable tool for dealing with complex systems [12].
4.1. Types of Neural Network
Neural networks can be classified into two categories according to the type of connections between the neurons: feed-forward and recurrent (or feedback). In feed-forward networks, connections between neurons point forward, going from the input layer to the output layer. In recurrent (or feedback) networks, there are feedback connections between different layers of the network. Neural networks can also be classified into first-order and higher-order/polynomial according to the form of the input sent to neurons. First-order networks send weighted sums of inputs through the transfer functions. Higher-order/polynomial networks send weighted sums of products or functions of inputs through the transfer functions [12]. Neural networks must be trained to find the optimal network structure for the solution of a particular problem. Training algorithms can be divided into two main categories: supervised and unsupervised. In supervised learning, the network is provided with example cases and desired responses. The network weights are then adapted in order to minimize the difference between network outputs and desired outputs. In unsupervised learning, the network is given only input signals, and the network weights change through a predefined mechanism, which usually groups the data into clusters of similar data without outside help [12]. The back-propagation algorithm is the one most used in the training phase of supervised learning. Determining the most suitable network configuration and the best parameters for a given application is a trial-and-error process, especially when the relationships between the variables are not well understood.
4.2. Back-Propagation
Most neural network approaches to the problem of forecasting use a multilayer network trained using the back-propagation algorithm.
Fig 1. Schematics of a three-layer back-propagation neural network (input layer, hidden layer, output layer)
Figure 1 shows the topology of a three-layer network. The first layer is the input layer, whose units are the only ones in the network that receive input data. The second layer is the hidden layer, in which the processing units are interconnected to the layers to their left and right. The third layer is the output layer. Each processing unit is connected to every unit in the adjacent layers, but not to other units in the same layer. A back-propagation network can have more than one hidden layer, although many have only one or two [5,10,21]. Back-propagation training consists of three steps: first, present all example inputs to the network and compute the activation functions sequentially forward from the hidden layer to the output layer; second, compute the difference between the desired output and the actual network output, and propagate the error sequentially backward from the output layer to the hidden layer (hence the term back-propagation); third, for every connection, change the weight of that connection in proportion to the error. When these three steps have been performed, one epoch has occurred. Training usually stops when a predetermined maximum number of epochs is reached or the network output error falls below an acceptable threshold [12,21].
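For concreteness, the three steps can be written out for a single-hidden-layer network with sigmoid units. The following is a minimal illustrative sketch in Python/NumPy (the paper's experiments were run in Matlab); the bias-free layers and the learning rate lr are simplifications for illustration, not details from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_epoch(X, D, W1, W2, lr=0.1):
    """One batch epoch of the three back-propagation steps.
    X: (n, inputs) examples; D: (n, outputs) desired outputs;
    W1: input-to-hidden weights; W2: hidden-to-output weights (updated in place)."""
    # Step 1: forward pass from the input through the hidden to the output layer.
    H = sigmoid(X @ W1)                      # hidden-layer activations
    O = sigmoid(H @ W2)                      # network outputs
    # Step 2: output error, propagated backward layer by layer.
    d_out = (D - O) * O * (1.0 - O)          # error term at the output layer
    d_hid = (d_out @ W2.T) * H * (1.0 - H)   # error term at the hidden layer
    # Step 3: change each weight in proportion to the error it carries.
    W2 += lr * H.T @ d_out
    W1 += lr * X.T @ d_hid
    return 0.5 * np.sum((D - O) ** 2)        # batch sum-of-squares error
```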
5. PROPOSED FORECASTING SYSTEM AND RESULTS
This section presents the proposed forecasting system (PFS) in two subsections: the structure of the PFS and the experimental results. The PFS subsection explains the block diagram of the system. The experimental results subsection discusses the results obtained in detail; it also explains two methods for building the error model.
5.1 The Proposed Forecasting System (PFS):
Figure 2 shows the main steps for the implementation of the PFS: a preprocessing step that normalizes the input data, followed by the neural network architecture with its learning process, which yields the best model.

Figure 2. Block diagram of the proposed forecasting system (PFS): input data → preprocessing (normalized values) → neural network architecture & learning process → best model
Preprocessing: The main function of this step is to convert the input data to values ranging from zero to one through a normalization process. The input data are the monthly historical Nino 3 anomalies, obtained from the data bank for the period January 1951 to April 2004 [25]. The monthly El Nino data consist of sequences of values in time, called time series data. The data are normalized to the range 0 to 1 to improve the performance of the neural network according to the following equation [11]:

NORM_v = \frac{v - \min_A}{\max_A - \min_A}\,(new\_max_A - new\_min_A) + new\_min_A \qquad (1)

where NORM_v is the normalized value, v is the original value of attribute A (the Nino 3 data), \max_A is the maximum value of attribute A, and \min_A is the minimum value of attribute A. Min-max normalization maps a value v of attribute A to NORM_v in the range [new\_min_A, new\_max_A]; here new\_min_A is set to 0.0 and new\_max_A to 1.0.
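As an illustration of equation (1), a minimal Python/NumPy version follows (the paper's processing was done in Matlab; the sample values below are invented, not the real anomaly series):

```python
import numpy as np

def min_max_normalize(v, new_min=0.0, new_max=1.0):
    """Min-max normalization of equation (1): map values of an attribute
    to the range [new_min, new_max]."""
    v = np.asarray(v, dtype=float)
    min_a, max_a = v.min(), v.max()
    return (v - min_a) / (max_a - min_a) * (new_max - new_min) + new_min

# Illustrative values only, standing in for monthly Nino 3 anomalies:
anomalies = [-1.2, 0.3, 2.5, -0.4, 1.1]
print(min_max_normalize(anomalies))  # all outputs lie in [0.0, 1.0]
```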
Neural network architecture and learning process: The architecture of the neural network used for predicting the El Nino phenomenon is a multilayer feed-forward network. It consists of an input layer, one hidden layer and an output layer. The input layer contains the inputs given to the network; its size is fixed for each experiment, at two, three or four neurons. The hidden layer consists of nonlinear neurons and performs nonlinear transformations using the sigmoid function; this function also keeps the data within the range 0 to 1. The size of the hidden layer is variable, and the best way to determine it is trial and error; in each experiment the size ranges from one neuron to 10 neurons, to find the best accuracy. The output layer consists of one neuron producing the predicted value. The back-propagation algorithm is used to train the network. It is an iterative gradient descent algorithm designed to minimize the mean square error between the actual output of the multilayer feed-forward network and the desired output by adapting the weights [12]. The weights are adapted to minimize the error function

E_C = \frac{1}{2}\sum_{c=1}^{C} (D_c - O_c)^2 \qquad (2)

where C is the number of nodes in the output layer, D_c is the desired network output, and O_c is the actual network output. The batch mode of training is used, where weight updating is performed after the presentation of all examples. The performance of the neural network is evaluated by the normalized root mean square error (NRMSE) test, which compares the predicted estimates with the actual observed values according to the equation

NRMSE = \sqrt{\frac{\sum_t [x(t) - \hat{x}(t)]^2}{\sum_t x(t)^2}} \qquad (3)

where \hat{x}(t) is the forecast of x(t). The network is trained for 50000 iterations. The learning process of the neural network was implemented in a Matlab package.
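Equation (3) is straightforward to compute. A small helper, assuming equal-length arrays of observed and forecast values, might look like:

```python
import numpy as np

def nrmse(actual, predicted):
    """NRMSE of equation (3): sqrt( sum (x - x_hat)^2 / sum x^2 )."""
    x = np.asarray(actual, dtype=float)
    x_hat = np.asarray(predicted, dtype=float)
    return np.sqrt(np.sum((x - x_hat) ** 2) / np.sum(x ** 2))
```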
5.2. Experimental Results
This section presents the experimental results in three subsections. The first determines the architecture of the network that gives the best accuracy, through three experiments. The second builds the error model of the best network topology to improve the accuracy; this is done through two experiments. The third builds the error model after eliminating the extreme values from the original values, in two steps.
5.2.1. Best topology model of neural network
This section describes the three experiments performed to find the best topology of the neural network. They were applied to the monthly Nino 3 values, normalized to the range 0 to 1 as described in section 5.1. The data cover the period from January 1951 to April 2004, i.e. 639 months. These months were divided into two ranges: a training range representing about 65% of the total period, and a testing range representing about 35%.
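The paper does not spell out how the input nodes are formed from the series, but the example counts are consistent with a sliding window in which k consecutive months feed a network with k input nodes to predict the following month: 639 months give 637, 636 and 635 examples for k = 2, 3, 4, matching the 420/217, 420/216 and 420/215 splits of the three experiments below. A sketch under that assumption:

```python
import numpy as np

def make_windows(series, k):
    """Form (input, target) pairs: k consecutive months -> the next month.

    Assumed reading of the experiments: a network with k input nodes
    receives the k most recent normalized Nino 3 values and is trained
    to predict the value of month k+1.
    """
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + k] for i in range(len(series) - k)])
    y = series[k:]
    return X, y

# 639 months with k = 2 give 637 examples, split roughly 65%/35% into
# 420 training and 217 testing examples, as in Experiment 1 below.
```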
Experiment 1:
In this experiment the architecture of the neural network was as follows: two nodes in the input layer, with 420 examples in the training phase and 217 examples in the testing phase. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 1 below shows the results.

Table 1: Training error and testing error for the two-input combination

Arc. N.N.   Epoch   Time     Training NRMSE   Testing NRMSE
2,1,1       361     7.45     0.16103          0.13367
2,2,1       793     18.637   0.15911          0.12479
2,3,1       349     8.492    0.15901          0.12539
2,4,1       736     20.059   0.15894          0.12538
2,5,1       1476    40.009   0.15839          0.12421
2,10,1      15809   565.663  0.15270          0.14953
Table 1 indicates that the best result comes from architecture (2:5:1), where the NRMSE for training is 0.15839 and the NRMSE for testing is 0.12421.

Fig 3. Predicted versus actual Nino 3 event in testing phase [Exp. 1]
Figure 3 represents the actual and predicted Nino 3 event after applying the back-propagation training algorithm in experiment one.
Experiment 2:
This experiment used three nodes in the input layer, with 420 examples for the training phase and 216 examples for the testing phase of the back-propagation learning neural network. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 2 below shows the results.
Table 2: Training error and testing error for the three-input combination

Arc. N.N.   Epoch   Time     Training NRMSE   Testing NRMSE
3,1,1       903     15.297   0.15936          0.13169
3,2,1       731     14.484   0.15756          0.12417
3,3,1       1226    25.046   0.15692          0.12409
3,4,1       1893    40.312   0.15621          0.12559
3,5,1       3139    71.14    0.15458          0.12507
3,10,1      8654    246.406  0.14799          0.14271

Fig 4. Predicted versus actual Nino 3 event in testing phase [Exp. 2]
Figure 4 represents the actual and predicted Nino 3 event after applying the back-propagation training algorithm in experiment 2.
Experiment 3:
This experiment used four nodes in the input layer, with 420 examples for the training phase and 215 examples for the testing phase of the back-propagation learning neural network. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 3 below shows the results.
Table 3: Training error and testing error for the four-input combination

Arc. N.N.   Epoch   Time     Training NRMSE   Testing NRMSE
4,1,1       1399    30.174   0.15823          0.13090
4,2,1       2881    69.851   0.15638          0.13015
4,3,1       1646    39.908   0.15330          0.12756
4,4,1       3638    91.973   0.15420          0.12557
4,5,1       5063    140.001  0.15411          0.12618
4,10,1      9120    262.717  0.15265          0.12743
Table 3 shows that the best result was obtained from the neural network architecture (4:4:1).

Fig 5. Predicted versus actual Nino 3 event in testing phase [Exp. 3]
Figure 5 depicts the actual and predicted Nino 3 event after applying the back-propagation training algorithm in experiment 3. From the three preceding experiments it is clear that the topology giving the best accuracy is (3:3:1). This topology was therefore used to build the error model in the remaining experiments.
5.2.2 The first error model:
This section proposes an error model, built without eliminating the extreme values in Nino 3, that corrects the predicted values by adding deviation values. A neural network is used again for this purpose. The input to the network is the predicted values from experiment two; the target values are the errors calculated as the difference between the target values and the predicted values of experiment two. This section contains two experiments: experiment 4 and experiment 5.
Experiment 4:
This experiment searches for the best architecture of the error model. It used two nodes in the input layer, with 420 examples for the training phase and 215 examples for the testing phase of the back-propagation learning neural network. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 4 below shows the results.
Table 4: Training error and testing error for the error model (Exp. 4)

Arc. N.N.   Epoch   Time      Training NRMSE   Testing NRMSE
2,1,1       593     10.516    0.06073          0.06910
2,2,1       1758    35.422    0.05274          0.05989
2,3,1       749     16.062    0.04893          0.04654
2,4,1       2063    48.968    0.04882          0.04797
2,5,1       2479    60        0.04759          0.04816
2,10,1      28753   1.07E+03  0.03344          0.06913
Table 4 shows that the best architecture is (2:3:1). The maximum percentage error before experiment four was 292.248% and was damped to 23.914% after applying this experiment.

Fig 6. Nino 3 actual values versus corrected predicted values in testing phase [Exp. 4]
Figure 6 represents the actual and predicted values, after adding the error values, obtained by applying the back-propagation training algorithm in experiment 4.
Experiment 5:
This experiment used three nodes in the input layer, with 420 examples for the training phase and 214 examples for the testing phase of the back-propagation learning neural network. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 5 below shows the results.
Table 5: Training error and testing error for the error model (Exp. 5)

Arc. N.N.   Epoch   Time      Training NRMSE   Testing NRMSE
3,1,1       1350    24.954    0.06071          0.06913
3,2,1       2462    49.453    0.05279          0.05589
3,3,1       3613    77.922    0.05154          0.05193
3,4,1       1920    55.39     0.04871          0.04592
3,5,1       4180    101.75    0.04768          0.04599
3,10,1      31315   1.13E+03  0.03258          0.08355
Table 5 indicates that the best architecture is (3:4:1). The maximum percentage error before experiment five was 292.248%; after applying this experiment the error was damped to 23.914%.

Fig 7. Nino 3 actual values versus corrected predicted values in testing phase [Exp. 5]
Figure 7 represents the actual and predicted values, after adding the error values, obtained by applying the back-propagation training algorithm in experiment 5. The NRMSE is approximately equivalent for experiments four and five, but the architecture of experiment four is preferable because it is simpler, taking less time and fewer iterations (epochs) in the learning phase.
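In outline, the first error model is a residual-correction scheme: a second network is trained to predict the forecaster's errors, and its output is added back to the forecast. The following self-contained sketch illustrates the idea on a toy series; the series, the tiny train helper, its learning rate and epoch count, and the min-max rescaling of the residuals are all illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def train(X, D, n_hidden, epochs=5000, lr=0.5):
    """Tiny batch back-propagation trainer; an illustrative stand-in for
    the Matlab training used in the paper."""
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1))
    for _ in range(epochs):
        H = sigmoid(X @ W1); O = sigmoid(H @ W2)
        d_out = (D - O) * O * (1 - O)
        d_hid = (d_out @ W2.T) * H * (1 - H)
        W2 += lr * H.T @ d_out; W1 += lr * X.T @ d_hid
    return lambda Xn: sigmoid(sigmoid(Xn @ W1) @ W2)

# Toy series standing in for the normalized Nino 3 data (not the real data).
series = (np.sin(np.linspace(0.0, 20.0, 120)) + 1.0) / 2.0
X = np.array([series[i:i + 3] for i in range(len(series) - 3)])
y = series[3:].reshape(-1, 1)

forecaster = train(X, y, n_hidden=3)          # the (3:3:1) forecasting model
pred = forecaster(X)
residual = y - pred                           # error values: target - predicted

# Error model: lagged forecasts as inputs (Exp. 4 uses two, a (2:3:1) net).
Xe = np.array([pred[i:i + 2, 0] for i in range(len(pred) - 2)])
r = residual[2:]
r01 = (r - r.min()) / (r.max() - r.min())     # residuals min-max normalized
error_net = train(Xe, r01, n_hidden=3)
corrected = pred[2:] + error_net(Xe) * (r.max() - r.min()) + r.min()
```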
5.2.3 The second error model:
To construct the second error model, two steps must be followed, as shown in figure 8: calculating the percentage error (PE) and applying the neural network to obtain the best error model.

Fig 8. Block diagram of the second error model: error values (predicted - target) → calculation of PE → error without odd values → neural network architecture & learning process → best error model
Step 1: In this step the percentage error (PE) is calculated for the test phase of the best architecture of experiment two (3:3:1), according to the equation

PE = \frac{|x_i - \hat{x}_i|}{x_i} \times 100.0 \qquad (4)

where x_i is the actual value and \hat{x}_i is the predicted value of each tested example.

Figure 9 shows the percentage error (PE). As the figure shows, only a few points have a percentage error greater than 30%. Thus, the threshold is set at 30%, and only examples with PE less than or equal to this threshold are kept.
Fig 9. El Nino percentage error for the test phase
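A sketch of Step 1 follows; the helper names are illustrative, and the absolute value in equation (4) is assumed from the nonnegative values plotted in Fig. 9.

```python
import numpy as np

def percentage_error(actual, predicted):
    """Percentage error of equation (4), one value per tested example."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(actual - predicted) / np.abs(actual) * 100.0

def drop_extremes(actual, predicted, threshold=30.0):
    """Keep only the examples at or below the 30% PE threshold of Step 1,
    discarding the few extreme points visible in Fig. 9."""
    keep = percentage_error(actual, predicted) <= threshold
    return actual[keep], predicted[keep]
```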
Step 2: In this step the best architecture for the error model was found using the neural network, with input-layer combinations of two and three nodes. For the two-node combination, 375 examples were used in the training phase and 188 examples in the testing phase of the back-propagation learning neural network. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 6 below shows the results. It is clear from this table that the best architecture is (2:4:1).

Table 6: Training error and testing error for the error model (two inputs)

Arc. N.N.   Epoch   Time      Training NRMSE   Testing NRMSE
2,1,1       1345    26.979    0.17863          0.10774
2,2,1       5673    124.039   0.15783          0.09161
2,3,1       4038    65.224    0.15763          0.09164
2,4,1       21882   222.75    0.14297          0.08055
2,5,1       17998   480.992   0.14851          0.12961
2,10,1      50000   2.01E+03  0.13902          0.16122
For the three-node combination, 375 examples were used in the training phase and 187 examples in the testing phase of the back-propagation learning neural network. The number of nodes used in the hidden layer was 1, 2, 3, 4, 5 and 10 respectively. Table 7 below shows the results. It is clear from this table that the best architecture is (3:4:1).

Table 7: Training error and testing error for the error model (three inputs)

Arc. N.N.   Epoch   Time      Training NRMSE   Testing NRMSE
3,1,1       514     10.816    0.17862          0.10777
3,2,1       5000    10.765    0.15462          0.09178
3,3,1       6339    152.84    0.15459          0.09179
3,4,1       34248   949.135   0.15075          0.08729
3,5,1       2595    66.946    0.16268          0.10992
3,10,1      50000   1.78E+03  0.13728          0.18051

The best topology seen from tables 6 and 7 is the (2:4:1) model. In the testing phase the NRMSE was 0.10074 before applying this error model and was damped to 0.02594 after applying it. Figure 10 below shows the corrected predictions for the best topology (2:4:1).
Fig 10. Nino 3 actual values versus corrected prediction values in testing phase
6. CONCLUSION
This paper applies the neural network as a data mining technique for forecasting the El Nino (Nino 3) phenomenon. The neural network back-propagation algorithm was applied in the implementation of the experiments. Three experiments were carried out in the forecasting phase, using different architectures of the input and hidden layers, and the NRMSE was used as the measure for choosing the best forecasting architecture. The best topology according to the NRMSE measure was (3:3:1); its maximum percentage error was 292.248%. The second achievement of this research is improving the forecasting accuracy. To achieve this goal an error model was implemented using two methods, again applying the neural network back-propagation algorithm. With the first method the maximum percentage error was damped to 23.914%. The second method depends on eliminating the extreme values from the best forecasting model (3:3:1); applying the neural network to these values, the best error model was (2:4:1), which damped the NRMSE from 0.10074 to 0.02594.
REFERENCES
[1] Behnke J., Dobinson E., Graves S., Hinke T., Nichols D., and Stolorz P., "Final Report for NASA Workshop on Issues in the Application of Data Mining to Scientific Data", 1999.
[2] Box G. E. P., and Jenkins G. M., "Time Series Analysis: Forecasting and Control", Holden-Day, San Francisco, CA, 1976.
[3] Cane M. A., Zebiak S. E., and Dolan S. C., "Experimental forecasts of El Nino", Nature, 321, 827-832, 1986.
[4] Chen D., Zebiak S. E., and Cane M. A., "Experimental Forecast with the Latest Version of the LDEO Model", http://grads.iges.org/ellfb/Mar05/chen/chen.htm, 2004.
[5] Clausen M., Korner H., and Kurth F., "An Efficient Indexing and Search Technique for Multimedia Databases", SIGIR Multimedia Information Retrieval Workshop, 2003.
[6] Duda R., Hart P., and Stork D., "Pattern Classification", 2nd Edition, Wiley Interscience, 2000.
[7] Ertöz L., Steinbach M., and Kumar V., "Finding Clusters of Different Sizes, Shapes, and Densities in Noisy, High Dimensional Data", in Proc. of the 3rd SIAM International Conference on Data Mining, San Francisco, CA, USA, 2003.
[8] Fayyad U., Piatetsky-Shapiro G., Smyth P., and Uthurusamy R., "Advances in Knowledge Discovery and Data Mining", MIT Press, Cambridge, MA, 1996.
[9] Frawley W. J., Piatetsky-Shapiro G., and Matheus C. J., "Knowledge discovery in databases: an overview", in G. Piatetsky-Shapiro and W. J. Frawley (eds), Knowledge Discovery in Databases, AAAI/MIT Press, Menlo Park, CA, pp. 1-27, 1991.
[10] Grossman R., Kamath C., Kegelmeyer P., Kumar V., and Namburu R., "Data Mining for Scientific and Engineering Applications", Kluwer Academic Publishers, 2001.
[11] Han J., and Kamber M., "Data Mining: Concepts and Techniques", Morgan Kaufmann, San Francisco, 2001.
[12] Haykin S., "Neural Networks: A Comprehensive Foundation", Second Edition, Prentice-Hall, Inc., 1999.
[13] Kondrashov D., Ghil M., and David N. J., "Forecasts of Nino-3 SST anomalies and SOI based on singular spectrum analysis combined with the maximum entropy method", Geophysics, University of California, Los Angeles, California, 2004.
[14] Liu Y., "A framework of data mining application process for credit scoring", Institut für Wirtschaftsinformatik, Georg-August-University Göttingen, http://www.wi2.wiso.uni-goettingen.de/getfile?DateiID=394, 2002.
[15] Moxon B., "Defining Data Mining", DBMS Online, DBMS Data Warehouse Supplement, http://www.dbmsmag.com/9608d53.html, 1996.
[16] Penland C., and Sardeshmukh P. D., "The optimal growth of tropical sea surface temperature anomalies", J. Clim., 8, 1999-2024, 1995.
[17] Potter C., Klooster S., Steinbach M., Tan P., Kumar V., Shekhar S., Nemani R., and Myneni R., "Global Teleconnections of Ocean Climate to Terrestrial Carbon Flux", Journal of Geophysical Research, 108, 2003.
[18] Quinn W. H., Neal V. T., and Antunez de Mayolo S. E., "El Nino occurrences over the past four and a half centuries", J. Geophys. Res., 92, 449-461, 1987.
[19] Steinbach M., Tan P., Kumar V., Potter C., and Klooster S., "Data Mining for the Discovery of Ocean Climate Indices", in Proc. of the Fifth Workshop on Scientific Data Mining, 2002.
[20] Weiss S. M., and Kulikowski C. A., "Computer Systems that Learn: Classification and Prediction Methods from Statistics, Neural Nets, Machine Learning, and Expert Systems", Morgan Kaufmann Publishers, Inc., San Mateo, California, 1991.
[21] Werbos P., "Backpropagation through time: what it does and how to do it", Proc. IEEE, Vol. 78, 1990.
[22] Zaki M. J., and Hsiao C. J., "CHARM: An efficient algorithm for closed itemset mining", in Proc. of the 2nd SIAM International Conference on Data Mining, 2002.
[23] Zebiak S., Ropelewski C., and Edward S., "El Nino and the Science of Climate Prediction", Consequences, vol. 5, no. 2, pp. 3-15, http://www.gcrio.org/index.htm, 1999.
[24] Zurada J., and Kantardzic M., "Next Generation of Data Mining Applications", Wiley-IEEE Press, 696 pp., 2004.