Journal of Petroleum Science and Engineering 86–87 (2012) 118–126
A fast and independent architecture of artificial neural network for permeability prediction

Pejman Tahmasebi ⁎, Ardeshir Hezarkhani

Department of Mining, Metallurgy and Petroleum Engineering, Amirkabir University (Tehran Polytechnic), Hafez Ave. No. 424, Tehran, Iran

Article history:
Received 6 August 2011
Accepted 14 March 2012
Available online 26 March 2012

Keywords:
modular neural networks
petroleum reservoir
petrophysical data
data integration
permeability prediction

Abstract

Permeability is one of the most important parameters of hydrocarbon reservoirs, as it represents and controls the production and flow paths. Different direct and indirect methods try to measure this parameter, and most of them, such as core analysis, are very time consuming and costly. Applying an efficient method that can model this important parameter is therefore necessary. One method that has recently been used frequently is the artificial neural network (ANN), which has a significant ability to find the complex spatial relationships among the existing reservoir parameters. Despite the many applications of ANNs, most of them model the whole reservoir at once, so one must separate the different domains and use different networks. Most of them also fail to use a priori knowledge or other sources of data efficiently. Furthermore, the previous networks are slow and CPU demanding when they encounter very large datasets, and they lose accuracy when only a few data are available. These limitations lead us to the modularity concept, which is borrowed from biological systems, to address those problems. Thus, to mitigate these problems, a modular neural network (MNN) is presented. For this aim, an Iranian oil field containing three wells was selected, and different multilayer perceptron (MLP) and MNN models were compared. In other words, the proposed method, with four different architectures, was used to predict permeability, and the obtained results were compared statistically. Compared with the traditional MLP, the new method promises very low computational time, the ability to handle complex problems, high learning capacity and affordability for most applications. The results show that R² improved from 0.94 for the MLP network to 0.99 for the MNN network.

© 2012 Elsevier B.V. All rights reserved.

⁎ Corresponding author. Tel.: +98 21 64542968; fax: +98 21 66405846. E-mail address: [email protected] (P. Tahmasebi). doi:10.1016/j.petrol.2012.03.019

1. Introduction

Permeability is one of the most important parameters in petroleum reservoirs, and its accurate value can have a great effect on production and management procedures. Direct measurements, such as experimental methods on cores, have therefore been used to find permeability values. These methods are very useful, but they are not sufficient to capture the heterogeneity of a reservoir: because of their intensive time demands and high cost, only a limited number of wells can be drilled. Presenting a method able to predict the petrophysical properties, particularly permeability, is therefore necessary. Further restrictions arise from the unavailability of cores, missing cores in certain intervals, and so on, which again lead us to predictive methods. Furthermore, most logging tool operations cannot measure permeability directly, so their output must be interpreted. Several researchers have also tried to find a relationship between widely available reservoir parameters (e.g. depth and porosity) and permeability in order to reach the permeability indirectly; one of these methods is nonlinear regression. However, because of high reservoir heterogeneity, the resulting equations are not reliable in most conditions. For this reason, a method that can predict permeability under different reservoir heterogeneity conditions is necessary. In recent years, artificial intelligence methods, owing to their intrinsic ability to capture the nonlinearity and complex heterogeneity of reservoirs, have become widespread, finding several applications such as data mining, prediction, risk assessment, uncertainty quantification and data integration (Aminian et al., 2000; Aminian et al., 2001; Asadisaghandi and Tahmasebi, in press; Ghezelayagh and Lee, 1999; Gorzalczany and Gradzki, 2000; Jagielska et al., 1999; Karimpouli et al., 2010; Mohaghegh, 1994; Mohaghegh et al., 2001; Nikravesh, 2004; Sahimi, 2000; Saemi and Ahmadi, 2008; Saemi et al., 2007; Tahmasebi and Hezarkhani, 2010a,b). Among all of the artificial intelligence methods (e.g. artificial neural network (ANN), fuzzy logic (FL), genetic algorithm (GA)), the ANN, owing to its flexibility and ability to solve nonlinear problems, finds the most applications. However, most ANNs require a time-consuming architecture-design procedure and suffer from the problem of local minima, which leads them to be used conservatively.


To avoid trapping the network in local minima, one can combine the ANN with a GA (Tahmasebi and Hezarkhani, 2010a). In spite of the wide applications and combinations of ANNs, their efficiency decreases when only a few data are available and/or in very complex situations in which the data are very noisy and finding a spatial relationship is very difficult. The efficiency of a designed network depends mostly on its learning algorithm, topology and data distribution, and these factors change from one dataset to another. Therefore, a significant amount of effort and time must be spent to find the optimum network.

One can regard the ANN as an optimization problem whose variables are the inputs and outputs, the weights, the learning coefficients (e.g. the learning and momentum rates) and so on, and which a learning algorithm solves by trying to reach a network that predicts the output accurately. It is thus possible to look at the ANN as a procedure in which, by adjusting some parameters, the desired output should result. Naturally, the convergence of the optimization is tied to the number of variables and to the availability of informative data to convey the complexity. Thus, if an optimization problem has few data and/or many variables, the optimization method cannot reach the global minimum (or maximum) and may become trapped in a local one. For this reason, one can reduce the number of involved variables in order to escape from local areas. In most previous studies, the problems of local minima and lack of data were addressed by combining the ANN with a global learning algorithm such as simulated annealing or a genetic algorithm. However, since these methods are iterative and need a lot of data for good convergence, they themselves increase the complexity of the problem at hand. Also, no clear attention has been paid to reducing the complexity of the ANN itself.

The aim of this paper is to reduce the complexity of the ANN by applying different network structures. It also focuses on finding an optimal architecture that yields a network with low computational cost and that can cope with situations in which few data are available. The main advantages of the current study can therefore be summarized as follows: thanks to an efficient complexity reduction in the ANN, the code should be fast, able to learn from a few examples, and able to escape from local minima. Another aim of this paper is to introduce to the petroleum industry a new concept called modularity, through which many advantages can be achieved. Furthermore, there is no related study in the earth sciences on the use of this type of ANN; the only related work is by Tahmasebi and Hezarkhani (2010a), who modeled the complex spatial relationships in mining tasks for grade estimation.

The rest of this paper is organized as follows. Section 2 explains the proposed methodology. Section 3 presents a case study by which the accuracy of the proposed methodology is evaluated. Section 4 gives the results and discussion, including statistical comparisons showing the efficiency of the proposed method against the trial-and-error-based ANN. The paper is summarized in Section 5.


2. Theoretical background on ANN

2.1. Concepts

The artificial neural network (ANN) is a tool that mimics the human brain. Today, it has a variety of applications in science, engineering, the social sciences, economics and so on, and it is widely used in the oil industry. One of the first reasons that led researchers to investigate the human brain was its parallel computing ability, which lets the brain outperform computers; after that, nervous systems were studied widely. Mathematically speaking, one can regard the human nervous system as a large number of elements arranged mainly in different layers. A schematic description of this architecture can be seen in Fig. 1, which shows three main kinds of layer: an input layer, hidden layers and an output layer. The input signal propagates through the layers, which contain many elements, and produces the output. The main role of the hidden layers is to find the spatial relationships between the input and output. Based on an iterative scheme, and similar to human learning, the associated weights and biases are adjusted to produce the desired output; the output response is therefore a combination of the input, weights, biases and hidden-layer elements. In the hidden and output layers, a function, most often the sigmoid function, is used to compute the output (van der Baan and Jutten, 2000). The algorithm continues over an assumed number of iterations, calculating the produced error, which is the difference between the real and estimated values. Finally, according to some predefined criteria, such as the error tolerance and/or the number of iterations, the final network is obtained (Bishop, 1995). If the produced error exceeds the predefined error, backward propagation occurs, in which the weights are changed in such a way that the error decreases.

Fig. 1. MLP network with two hidden layers, showing the nested weight vectors.

2.2. Multi-layer perceptron (MLP)

The MLP is one of the most prevalent feedforward ANNs; using the methodology described in Section 2.1, it maps the inputs to the output. Usually, the internal layers of an MLP are fully connected with each other. Basically, there are two types of learning, supervised and unsupervised: in a supervised learning procedure, the desired output for each training input is presented, while in an unsupervised learning procedure the output is not provided. The MLP uses the back-propagation (BP) technique for training, which in essence is a kind of supervised learning. Since in this study we want to look at the ANN as an optimization problem, it is better to introduce the complexity of its learning. The error in an ANN is usually measured by different criteria. One of these criteria is the mean square error (MSE), defined by

\varepsilon(n) = \mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (E_i - R_i)^2,  (1)

where E_i is the value estimated by the network for training data vector i, R_i is the real value of data vector i, and n is the total number of data available for training.
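For illustration only (the paper's own modeling was done in MATLAB, and the array names below are hypothetical), Eq. (1) translates directly into a few lines of Python:

```python
import numpy as np

def mse(estimated, real):
    """Mean square error of Eq. (1): the mean squared difference
    between the network estimates E_i and the real values R_i."""
    estimated = np.asarray(estimated, dtype=float)
    real = np.asarray(real, dtype=float)
    return np.mean((estimated - real) ** 2)

# Hypothetical example: three estimates against three measured values.
print(mse([0.70, 0.30, 0.50], [0.73, 0.28, 0.61]))  # -> 0.00447
```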


As mentioned, the weights in an ANN should be changed so as to decrease the error and maximize the fitness. Using gradient descent, the change in each weight is

\Delta\omega_{ji}(n) = -\xi \, \frac{\partial \varepsilon(n)}{\partial \chi_j(n)} \, \Theta_i(n),  (2)

where \Theta_i is the output of the previous neuron and \xi is the learning rate, which controls the size of the weight changes and the convergence. For example, a large learning rate makes the algorithm very fast, but it may become trapped in local minima, whereas a small value makes the algorithm very slow and CPU demanding. The error change depends on the induced local field \chi_j. Its derivative for an output node can therefore be written as

-\frac{\partial \varepsilon(n)}{\partial \chi_j(n)} = \varepsilon_j(n) \, \psi'\left(\chi_j(n)\right),  (3)

where \psi' denotes the derivative of the activation function. There are many mutually dependent parameters, which makes it difficult to show all of the equations; for a hidden node, the relevant derivative is

-\frac{\partial \varepsilon(n)}{\partial \chi_j(n)} = \psi'\left(\chi_j(n)\right) \sum_{k=1}^{n} \left(-\frac{\partial \varepsilon(n)}{\partial \chi_k(n)}\right) \omega_{kj}(n).  (4)
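To see how Eqs. (2)–(4) combine in practice, the following is a minimal from-scratch sketch of one BP epoch for a network with a single sigmoid hidden layer. It is an illustration added for this presentation rather than the paper's implementation (the authors worked in MATLAB), and all names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_epoch(X, R, W1, b1, W2, b2, lr=0.1):
    """One BP epoch for a one-hidden-layer network.
    X: (n, d) inputs; R: (n, 1) real (target) values."""
    # Forward pass: Theta holds the hidden outputs fed to the next layer.
    Theta = sigmoid(X @ W1 + b1)
    E = sigmoid(Theta @ W2 + b2)                 # estimated values E_i
    # Output-node local gradient, Eq. (3): error times psi'(chi).
    g_out = (E - R) * E * (1.0 - E)
    # Hidden-node local gradients, Eq. (4): psi'(chi_j) * sum_k g_k * w_kj.
    g_hid = (g_out @ W2.T) * Theta * (1.0 - Theta)
    # Weight changes, Eq. (2): delta_w = -lr * (derror/dchi) * previous output.
    n = len(X)
    W2 -= lr * Theta.T @ g_out / n
    b2 -= lr * g_out.mean(axis=0)
    W1 -= lr * X.T @ g_hid / n
    b1 -= lr * g_hid.mean(axis=0)
    return np.mean((E - R) ** 2)                 # MSE of Eq. (1)
```

Calling `backprop_epoch` repeatedly until the error tolerance or the iteration limit of Section 2.1 is met constitutes the training phase.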

It is clear that the new weights depend on the previous ones; to find the new weights of a hidden layer, the weights of the previous attempt are changed according to the derivative of the activation function. It is for this reason that the algorithm is called BP (Rumelhart et al., 1986; Simon, 1998). According to both Fig. 1 and the above formulation, the computational burden of an ANN is huge. The network is therefore very likely to be trapped in local minima when the data are noisy and the learning algorithm is local. For this reason, one should use a method that lets the modeler reduce the number of elements in the ANN efficiently, so that it can deal with few data and also run faster. These kinds of ANNs are explained below.

2.3. Modular neural network (MNN)

As explained, current MLP networks are mostly slow and suffer from massive computational costs, which usually leads them to be trapped in local minima and ultimately prevents accurate prediction of the desired output. Most evolutionary and artificial intelligence algorithms, particularly ANNs, are based on biological systems, so further study of those systems can help us overcome the limitations of the current methods. According to recent research on the brain, it is understood that the brain is composed of three main subsets which introduce modularity (Shepherd, 1974): complex tasks in the brain are decomposed into simpler ones (Rexrodt, 1981). This finding makes ANNs more flexible and closer to the real applications at hand. In other words, modularity is one of the most important factors in human and animal brains that helps them manage very complex tasks efficiently. For example, one can see the brain as a collection of individual functions and modules which work with each other effectively and decompose complex problems into several simpler ones (Mountcastle, 1978; Eccles et al., 1984; Edelman, 1987; Hubel, 1988). Several researchers have also shown that the brain is composed of massively parallel and modular parts which work relatively independently (Edelman, 1979, 1987; Frackowiak et al., 1997).

MLP networks have several problems, mentioned hereafter. In most cases the network is very large, and there is neither an efficient learning algorithm nor enough data to find the best weights. By dividing the network, however, one can define networks that are independent and simpler and smaller than the MLP. Because MLP networks are monolithic, an error and/or a change in them can propagate through and affect all parts. This reduces the stability of the network and makes it very sensitive to local variation and error, whereas a network should be able to damp undesirable fluctuations and reduce their effect. In most MLP networks it is also impossible, or very cumbersome, to implement a priori knowledge about the problem at hand; expert ideas and interpretational knowledge therefore cannot be considered, which makes such networks inappropriate for data integration.

In this section, to overcome the mentioned problems, the new concept of modularity is presented, but first let us describe a clear example of modularity in our visual system. Different studies show that the visual system must perform many tasks, such as motion detection and color, shape and intensity evaluation, and that a central system receives the results of these different parts and combines them into the final realization (Van Essen et al., 1992). A schematic architecture with the connection links of an MNN is presented in Fig. 2; one can compare Figs. 1 and 2 visually. The modularization of an ANN can overcome the mentioned problems. In other words, an MNN can contain different structures within itself, and one can even integrate a priori knowledge into it. Also, since the complex task is decomposed into several smaller and simpler ones, the overall network has lower complexity and CPU demand; one reason is that each module uses a smaller part of the data.

According to the above definitions and explanations, we can define the MNN as a network in which the massive computational burden is divided into modules, each of which has distinct inputs and is independent of the other modules in the network (Happel and Murre, 1994; Azam, 2000). Finally, the outputs of the modules are integrated to make the final output. Each part of an MNN thus performs a specific computational subtask of the whole system, is independent of the other modules, and cannot influence their work. Such a network also has a simpler structure than an MLP and can therefore respond to the input much faster.

Returning to Fig. 2, it is clear that there are few connections and weights, so the network size decreases dramatically. Consequently, the complexity of the network decreases, and finding the global minimum in a shorter time becomes much easier. As another result of the low complexity, we can use a smaller dataset, which is a typical feature of petroleum datasets (Feldman and Ballard, 1982; Jacobs et al., 1991; Jacobs, 1995).

Fig. 2. An example that represents the architecture of MNN schematically.
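As a toy illustration of this decomposition (an assumption-laden sketch added here, not the architecture used later in the paper), each module below is reduced to a least-squares linear map, and an integrating unit routes every sample to the module responsible for its zone:

```python
import numpy as np

class Module:
    """A small, independent sub-network; reduced to a linear map
    fitted by least squares for brevity."""
    def fit(self, X, y):
        A = np.c_[X, np.ones(len(X))]            # add a bias column
        self.coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self
    def predict(self, X):
        return np.c_[X, np.ones(len(X))] @ self.coef

def modular_predict(X, zone, modules):
    """Integrating unit: send each sample to the module responsible
    for its zone and stitch the outputs back together."""
    out = np.empty(len(X))
    for z, m in modules.items():
        mask = zone == z
        if mask.any():
            out[mask] = m.predict(X[mask])
    return out

# Hypothetical use: one module per permeability zone (low/high).
# modules = {0: Module().fit(X[zone == 0], y[zone == 0]),
#            1: Module().fit(X[zone == 1], y[zone == 1])}
```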


The MNN also has several other advantages, which are discussed further in Section 4.

3. Case study

The case study is located in the Persian Gulf, offshore Iran. According to the geological setting and the available data, this reservoir can be classified as a structural and heterogeneous reservoir. The interpreted sections and data show that, because of the high heterogeneity of this reservoir, traditional methods are not able to predict the petrophysical parameters effectively. Many wells have been drilled in this oil field and several petrophysical parameters have been acquired, but not all of them are suitable for this study. Among all the available data, we therefore selected the following parameters: spectral gamma ray (SGR), which is a valid indicator of shale regions, and electrical resistivity (Rt) and water saturation (Sw), which delineate the permeable regions well. Because of the correlation observed between porosity and permeability in most reservoirs, total porosity (φt) and secondary porosity (φs) were also selected. Furthermore, owing to the obvious effect of lithology on the prediction of reservoir attributes, we used sandstone (Sand), dolomite (Dol) and shale (Shale) fractions. Finally, to account for a vertical indicator and the overburden pressure, the depth of each core was included. It should be noted that, among all the wells drilled in this oil field, just five intersect the reservoir. We selected four of them, which lie at reasonable distances from each other; the remaining well is so far from the selected wells that we did not include it in the modeling.


4. Results and discussion

4.1. Data analysis

The data described in Section 3 were organized into inputs (SGR, Rt, Sw, φt, φs, Sand, Dol, Shale) and output (K). Since the available data belong to different sampling intervals, we unified them to the same length of 15 cm. The main statistical description of the data is summarized in Table 1. The parameters of the permeability distribution in Table 1 make it clear that this reservoir shows a high degree of heterogeneity. The heterogeneity can also be demonstrated through the correlations between permeability and the other input parameters. The obtained results show very poor correlations between the inputs and the output: for example, the correlations of Rt and Shale with permeability are 0.14 and 0.08, respectively, and the correlation between SGR and permeability is 0.007. Two conclusions can therefore be drawn: first, the information available in this reservoir is very complex and it is difficult to find a relationship between the variables; second, the input–output relationships are independent, so no single variable has a strong relationship with permeability. Furthermore, depth matching was performed before feeding the data to the network.

All of the available data were divided randomly into three distinct subsets for training, validation and testing, with proportions of 70%, 15% and 15% of the whole dataset, respectively. It is worth noting that, since the wells are arranged in a triangular shape, their locations do not allow one of them to be used as a blind well; moreover, ANNs, and most estimators in general, are not good at extrapolation. We therefore used 15% of the whole dataset for testing. Two criteria were used to stop the training phase: a minimum MSE and the validation data. Under the first criterion, training stops when the MSE reaches a predefined value. Under the second criterion, used to prevent over-fitting, training stops when the error on the validation dataset increases. In other words, the validation dataset is presented to the network alongside training to ensure that the network is generalizing rather than memorizing the training data and is able to predict unseen data well; whenever the validation error increases during training, there is no need to continue. Because of the intrinsic dispersion of the dataset, and to help the network converge faster, the available data were normalized into [0 1] (Asadisaghandi and Tahmasebi, in press). All data processing and NN modeling were done in MATLAB (R2011a). Besides, since the proposed network consists of different parallel and small elements, there is no need to consider different independent networks. Based on a study of the cumulative percentage of permeability, a threshold of 0.003 on the normalized data in [0 1] was used to define the low- and high-permeability zones.
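A minimal sketch of this preparation pipeline (hypothetical array names; the original processing was done in MATLAB):

```python
import numpy as np

def minmax01(a):
    """Normalize each column into [0 1], as described above."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo)

def split_70_15_15(n, seed=0):
    """Random index split into training/validation/testing subsets."""
    idx = np.random.default_rng(seed).permutation(n)
    n_tr, n_va = int(0.70 * n), int(0.15 * n)
    return idx[:n_tr], idx[n_tr:n_tr + n_va], idx[n_tr + n_va:]

# X: inputs (SGR, Rt, Sw, phi_t, phi_s, Sand, Dol, Shale), k: permeability.
# Xn, kn = minmax01(X), minmax01(k)
# train, valid, test = split_70_15_15(len(Xn))
# zone = (kn > 0.003).astype(int)   # low/high permeability zones
```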

4.2. Modeling

4.2.1. MLP

The MLP is one of the most prevalent ANN architectures and has several advantages (Bishop, 1995). In this paper it therefore serves as the baseline against which the proposed network (MNN) is compared; the inputs and output were the same throughout this study. Usually, finding an appropriate network of this architecture relies on trial and error: one tests different numbers of neurons and layers and then, based on the training and testing errors, tries to reach the optimal structure. This procedure takes a tremendous amount of time, and there is no guarantee of finding the best network this way. However, the optimal or a near-optimal network can be reached through a systematic methodology; Asadisaghandi and Tahmasebi (in press) proposed a comprehensive methodology for finding the optimal architecture. We therefore used that methodology in this study, and the results are summarized in Table 2.

Table 1
Statistical description of the data used in this study, for both input and output.

Variable        Max.     Min.     Mean     St. dev.  Median   Q(1)     Q(2)     Q(3)
Depth (m)       2873     2724.75  2792.75  33.28     2786.38  2768.62  2786.37  2820.12
Rt (ohm m)      62.34    2.27     14.01    12.02     8.69     4.83     8.69     22.76
Sw (%)          0.498    0.07     0.29     0.11      0.31     0.18     0.31     0.39
Phit (%)        0.262    0.01     0.15     0.05      0.16     0.12     0.16     0.03
Phisec (%)      0.13     0        0.01     0.02      0        –        0        0.02
Dolomite (%)    0.86     0        0.23     0.29      0.04     0        0.04     0.42
Sand (%)        1        0        0.34     0.36      0.16     0        0.16     0.70
Shale (%)       0.44     0        0.07     0.08      0.06     0        0.06     0.07
SGR (gapi)      94.54    17.54    42.59    13.74     42.25    33.52    42.25    7.54
K (md)          98.32    0.55     7.12     16.46     0.73     0.61     0.73     2.45


Table 2
The results of applying different MLP architectures for permeability estimation. The best obtained network is number 5 (marked with an asterisk).

Number of network   Number of neurons in hidden layer   Correlation coefficient   MSE
1                   5                                   0.87                      0.0143
2                   7                                   0.91                      0.0098
3                   9                                   0.92                      0.0091
4                   12                                  0.93                      0.0083
5*                  14                                  0.94                      0.0077
6                   17                                  0.85                      0.0175
7                   20                                  0.79                      0.0252
8                   23                                  0.82                      0.0214
9                   25                                  0.71                      0.0148
10                  28                                  0.70                      0.0348
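The trial-and-error search behind Table 2 can be pictured as a loop over candidate hidden-layer sizes. This sketch assumes a `train` callback (any trainer, e.g. the BP sketch of Section 2.2) and is an illustration, not the systematic methodology of Asadisaghandi and Tahmasebi (in press):

```python
import numpy as np

def architecture_search(X, y, X_test, y_test, train,
                        candidates=(5, 7, 9, 12, 14, 17, 20, 23, 25, 28)):
    """Train one network per hidden-layer size and keep the one with
    the lowest test MSE; also report R, as in Table 2."""
    best = None
    for n_hidden in candidates:
        predict = train(X, y, n_hidden)     # returns a prediction function
        est = predict(X_test)
        err = np.mean((est - y_test) ** 2)
        r = np.corrcoef(est, y_test)[0, 1]
        print(f"{n_hidden:3d} hidden neurons: R = {r:.2f}, MSE = {err:.4f}")
        if best is None or err < best[0]:
            best = (err, r, n_hidden, predict)
    return best
```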

In this section we used the Levenberg–Marquardt (LM) algorithm for training and the tansig function for the output of the NN. Based on the methodology of Asadisaghandi and Tahmasebi (in press), several hidden layers with different numbers of neurons were tested. Different normalization schemes were also tested, of which the range [0 1] gave the better results. According to Table 2, the best MLP architecture was found to be 9-14-1; this architecture yields the minimum MSE and the maximum correlation coefficient. As mentioned, 15% of the available dataset was devoted to testing, for which the MSE and correlation coefficient were 0.0121 and 0.94, respectively (the ideal values being 0 and 1). It should be remembered that using many neurons in the hidden layer may decrease the training error, mainly because the neurons memorize the data, while the network may lose its generalization. Using a smaller network is therefore beneficial for both CPU demand and generalization. Furthermore, many hidden neurons cause several local minima in the error surface: as the number of neurons increases, the number of variables in the network's objective function grows dramatically. An appropriate NN structure is therefore necessary.

Another problem that the oil industry frequently faces is lack of data. In this reservoir, for example, several wells have been drilled, but only a few of them have experimental permeability results, mainly because of the financial and time constraints of all projects. A method that can deal with these conditions, capture the complexity, and still provide acceptable accuracy is therefore needed. Below, we present an efficient method that both handles small datasets and decreases the CPU time.

4.2.2. MNN

All of the problems mentioned in the previous section lead us to look more closely at the brain and to find a solution able to overcome them. The amount of processing and computation in the human brain is tremendous. But how, and by what means, is the brain able to perform these huge computations? How, for example, does the visual system work? A simple pattern segmentation or edge detection takes a long time by computational methods, yet our brain does it much faster, so the brain must use a mechanism that lets it react to external stimuli quickly. The answer to all of these questions can perhaps be summarized in the modularity concept. This concept allows the brain to be fast, accurate and independent of other operators. In effect, the brain uses a divide-and-conquer approach, dividing a problem among subsets in order to decrease the computational burden and make the problem simpler.

Therefore, the aim in this study is to reduce the problems associated with permeability estimation by dividing the domain into different, simpler zones, thereby helping the networks increase their performance and reduce the CPU time. Based on this idea, we test four different MNN architectures. The main aim of the design is to reduce the size of the network so as to reduce the complexity and increase the network's efficiency; in other words, by reducing the complexity of the network we indirectly omit variables from its objective function. Statistically speaking, the aim of the MNN, like that of dimension-reduction techniques such as principal component analysis (PCA) and factor analysis (FA), is to reduce the complexity and variability of the NN. Along with this simplification, another aim of the MNN is to cope with situations in which few data are available (Tahmasebi and Hezarkhani, 2011).

Based on the equations presented in Section 2.2, several parameters affect the final results of an NN, for example the number of variables used as input. Input reduction can be done simply: one can draw a scatter plot of each input against the output and, based on each individual input–output relation, keep or omit that input. Another frequently used solution is PCA, which reduces the complexity associated with the inputs. But another parameter with a significant effect on the objective function is the weights. Reducing the weights effectively can produce an error surface with few local minima; reducing the variables of the objective function is therefore beneficial for both CPU time and accuracy of results. Consequently, if the size of the network is reduced effectively, fewer data are needed to achieve better results.

The main objective of the MNN design is to find an architecture for which, according to both the training and testing results, the CPU time is reduced and the performance is increased. The strategy of adding/eliminating weights or hidden neurons, with different connections and topologies, was therefore utilized. Three conditions can occur: (a) if the MSE and correlation coefficient do not change and the network's performance is constant, the weight vector is omitted/added; (b) if deleting/adding a weight vector decreases the network's performance, the weight is not omitted/added; and (c) if removing/adding a weight vector yields an improvement, the change is kept. This procedure continues in a trial-and-error framework until no further improvement is observed.
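Conditions (a)–(c) suggest a greedy search over a mask of candidate connections. The sketch below is a simplified reading of that procedure (in the paper it was carried out manually, in a trial-and-error framework); `evaluate` is a hypothetical callback that trains the masked network and returns its test MSE:

```python
import numpy as np

def add_eliminate_search(mask, evaluate, tol=1e-4, max_rounds=10):
    """Greedy add/eliminate search over a boolean connection mask.
    evaluate(mask): trains the network that uses only the connections
    where mask is True and returns its test MSE."""
    best = evaluate(mask)
    for _ in range(max_rounds):
        improved = False
        for idx in np.ndindex(mask.shape):
            trial = mask.copy()
            trial[idx] = ~trial[idx]       # add or eliminate one weight
            err = evaluate(trial)
            if err < best - tol:           # condition (c): keep the change
                mask, best, improved = trial, err, True
            # conditions (a)/(b): no gain or worse -> discard the change
        if not improved:                   # stop when nothing helps
            break
    return mask, best
```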

In this paper, four different architectures were constructed using the explained procedure; they are demonstrated below.

1) First MNN. The first MNN architecture used is shown in Fig. 3. This topology is the simplest network used in the MNN. As in the MLP network, all of the inputs, transfer functions, output and data ranges are the same; the only difference is the architecture, which leads to different weight vectors and consequently different responses. According to the obtained results, this architecture could not capture the complexity and the spatial relationship between the inputs and permeability: the MSE and correlation coefficient were 0.022 and 0.72, respectively. The number of weights should therefore be increased so that the objective function becomes more flexible and can capture and model the variability. To this end, the numbers of weight connections and neuron links are increased in the next architecture.

Fig. 3. A schematic architecture of the first used MNN.

2) Second MNN. In this step the number of weight links is increased, as can be seen in Fig. 4. With this architecture, a middle weight vector is added, and the MSE and correlation coefficient changed to 0.015 and 0.81, respectively. Again, the results indicate that the weight links are not yet appropriate and that new links should be added to recover the missing generalization of the ANN. A different architecture is therefore tested in the next step.

Fig. 4. A schematic architecture of the second used MNN.

3) Third MNN. The third MNN architecture is shown in Fig. 5. It connects the first layer to the output function. Clearly, this new architecture presents a more flexible topology that can better capture the spatial relationships. As expected, the network's performance increased: this network reports 0.96 and 0.0063 for the correlation coefficient and MSE, respectively. It seems that the architecture of the networks is converging, so in the next step a network that is a mixture of the previous networks is presented.

Fig. 5. A schematic architecture of the third used MNN.

4) Fourth MNN. The fourth MNN architecture is presented in Fig. 6. This network is expected to provide a lower error and higher performance than the other MNN networks, and the experimental results indeed indicate that, owing to its fine and effective links between the elements, the fourth network has a small error: the MSE and correlation coefficient are 0.0027 and 0.99, respectively. This architecture achieves excellent results, mainly because of its appropriate weight vectors.

Fig. 6. A schematic architecture of the fourth used MNN.

Another comparison between the different MNN architectures and the MLP concerns the number of epochs required for the network to reach a stable variation of the MSE; in other words, one can compare the MSE at each epoch to find out how well the different networks adjust their weights. This comparison can be seen in Fig. 7, which makes it obvious that the MNNs need less time to converge. This improvement in CPU time comes from using fewer weight vectors, which reduces the network's complexity; with fewer weights, and consequently fewer variables, the applied learning algorithm can find the global minimum faster.

Fig. 7. Comparison of the MLP and the different MNN networks in both accuracy and convergence speed (MSE versus training epoch).

The summary of the obtained correlation coefficients and the predicted permeabilities for the described networks are shown in Figs. 8 and 9, respectively. According to both figures, there is excellent agreement between the real and estimated permeabilities for the last obtained MNN. Furthermore, another reason the MNN architectures are faster is their ability to cope with a small dataset. The independent elements of the MNN let the network sections work separately, so the results of one section have no great effect on the other sections; on the other hand, working independently can be interpreted as a parallelization scheme that accelerates the overall speed (Azam, 2000).

According to the obtained results, the advantages of the MNN over the MLP can be summarized as follows. The first and most important advantage is the reduction of model complexity. The more complex the phenomenon under study, the more complex the MLP network becomes, mainly because of the increasing number of weights that must be optimized.

Fig. 8. Comparison of the real and estimated permeability, with the corresponding correlation coefficients, for (a) the MLP (R = 0.94), (b) the 1st MNN (R = 0.72), (c) the 2nd MNN (R = 0.81), (d) the 3rd MNN (R = 0.96) and (e) the 4th MNN (R = 0.99).

Fig. 9. Comparison of the estimated values of the different networks and the real permeability.

In an MNN, by contrast, the complexity is broken down into simpler cases. The nested, fully connected elements of the MLP also make it unstable and sensitive to variation, whereas, owing to the independent architecture of the MNN, the results of one section do not affect the other parts, and if a problem occurs in one section the other parts are not affected by the produced error. Another advantage is the flexibility of the MNN, which allows different source data to be simulated and used simultaneously, while in a traditional MLP one should separate data that show different behaviors. For example, it is difficult to construct an MLP that can model very low and very high permeability regions simultaneously, while in an MNN, by assigning each region to a module, the permeability of the whole field can be simulated. Furthermore, thanks to its independent modules, the MNN can use both supervised and unsupervised learning algorithms for different parts, which increases its flexibility compared with the MLP. Computational efficiency is another advantage of the MNN; this superiority comes from breaking the initial network down into smaller and simpler parts, which decreases the computational burden dramatically. The MNN also has the ability to integrate different datasets: different modules can use different datasets, different learning algorithms, different functions and different tasks, which increases the learning ability of the MNN tremendously. In addition, the independent nature of the MNN means that a new dataset can simply be added to the network without modifying it, while in an MLP the whole network must be rearranged for a new dataset; in the MNN, only the corresponding module is modified, and there is no need to refine the entire network. This structure also allows each part of the network to be improved individually, while in MLP networks the result of any change in a particular parameter propagates through the entire network. A meaningful representation of each part can therefore be obtained easily.

5. Conclusion and future works

Because of the importance of the permeability distribution, in this paper we tried to develop a new methodology for ANNs. Modularity is the concept used extensively in this study; by using this new concept, one can achieve many advantages. For instance, the network can use many different data sources and define different, independent paths for each of them. The main idea was based on the performance of the brain, which is able to process massive computational tasks very quickly and accurately (e.g. the visual system). By borrowing the same mechanism (modularity), it was possible to define different parallel networks. To compare the results in performance and accuracy, the new methodology was tested against traditional MLP methods. For the MNN, four different architectures were tested, each benefiting from very low computational cost and a highly independent structure that lets them perform well. The obtained results for permeability prediction showed that incorporating this idea from biological systems can be very efficient.
The major advantages of this methodology, and of the MNN in general, are its ability to reduce the complexity associated with the reservoir, its great stability in different situations, the use of different properties in its different modules (such as learning algorithms, performance functions, input data and neurons), very low computational cost, high learning ability from breaking the problem down into several simpler tasks, more flexibility toward new datasets and, finally, a great ability to use a priori knowledge or secondary data in its architecture. For example, in this study the whole reservoir was presented to the network, which makes the final model smaller and faster, with no need to define separate large networks for all of the parts; in addition, this global network can integrate different sub-networks efficiently. By using all of the mentioned advantages, one can easily model the problem at hand. That is why in this paper R² changed from 0.94 for the MLP network to 0.99 for the MNN network, which shows the efficiency of the method; furthermore, the new architecture took less time than the common networks.

This methodology is not specific to permeability or porosity prediction; the modularity concept can be used in other complex engineering and science problems. As future work, one could test the ability of this method when different datasets with different properties are available in the reservoir. Obviously, the method should be able to model the complexity and integrate the different datasets, so data integration for very complex problems should become easier with this method. In other words, one can easily use this network to integrate different data sources: for example, one path could use seismic data while another uses well data, integrating them successfully. Another useful line of work is to find an automatic procedure to design the MNN's architecture, because the architectures obtained in this study were based mainly on the user's experience and a trial-and-error method. Also, combining evolutionary methods such as genetic algorithms and simulated annealing with the MNN could be very interesting.

References

Aminian, K., Bilgesu, H.I., Ameri, S., Gil, E., 2000. Improving the simulation of waterflood performance with the use of neural networks. SPE 65630, Proceedings of the SPE Eastern Regional Conference, October.
Aminian, K., Thomas, B., Bilgesu, H.I., Ameri, S., Oyerokun, A., 2001. Permeability distribution prediction. SPE Paper, Proceedings of the SPE Eastern Regional Conference, October.
Asadisaghandi, J., Tahmasebi, P., 2011. Comparative evaluation of back-propagation neural network learning algorithms and empirical correlations for prediction of oil PVT properties in Iran oilfields. J. Petrol. Sci. Eng. 78 (2), 464–475.
Azam, F., 2000. Biologically Inspired Modular Neural Networks. PhD Dissertation, Virginia Tech.
van der Baan, M., Jutten, C., 2000. Neural networks in geophysical applications. Geophysics 65, 1032–1047.
Bishop, C., 1995. Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
Eccles, J.C., Jones, E.G., Peters, A., 1984. The cerebral neocortex: a theory of its operation. Cerebral Cortex: Functional Properties of Cortical Cells, Vol. 2. Plenum Press.
Edelman, G.M., 1979. Group selection and phasic reentrant signaling: a theory of higher brain function. In: Schmitt, F.O., Worden, F.G. (Eds.), The Neurosciences: Fourth Study Program. MIT Press, Cambridge, MA.
Edelman, G.M., 1987. Neural Darwinism: Theory of Neural Group Selection. Basic Books.
Feldman, J.A., Ballard, D.H., 1982. Connectionist models and their properties. Cogn. Sci. 6 (3).
Frackowiak, R.S.J., Friston, K.J., Frith, C.D., Dolan, R.J., Mazziotta, J.C., 1997. Human Brain Function. Academic Press, San Diego.
Ghezelayagh, H., Lee, K.Y., 1999. Training neuro-fuzzy boiler identifier with genetic algorithm and error backpropagation. IEEE Pow. Eng. Soc. Summer Meet. 2, 978–982.
Gorzalczany, M.B., Gradzki, P.A., 2000. A neuro-fuzzy-genetic classifier for technical applications. Proc. IEEE Int. Conf. Ind. Technol. 1, 503–508.
Happel, B., Murre, J., 1994. The design and evolution of modular neural network architectures. Neural Networks 7, 985–1004.
Hubel, D.H., 1988. Eye, Brain, and Vision. Scientific American Library, New York.
Jacobs, R.A., 1995. Methods of combining experts' probability assessments. Neural Comput. 7, 867–888.
Jacobs, R.A., Jordan, M.I., Nowlan, S.J., Hinton, G.E., 1991. Adaptive mixtures of local experts. Neural Comput. 3 (1), 79–87.
Jagielska, I., Matthews, C., Whitfort, T., 1999. An investigation into the application of neural networks, fuzzy logic, genetic algorithms, and rough sets to automated knowledge acquisition for classification problems. Neurocomputing 24, 37–54.
Karimpouli, S., Fathianpour, N., Roohi, J., 2010. A new approach to improve neural networks' algorithm in permeability prediction of petroleum reservoirs using supervised committee machine neural network (SCMNN). J. Petrol. Sci. Eng. 73, 227–232.
Mohaghegh, S., 1994. Artificial Neural Networks as a Valuable Tool for Petroleum Engineers. SPE 29220.
Mohaghegh, S., Gaskari, R., Popa, A., Ameri, S., Wolhart, S., 2001. Identifying best practices in hydraulic fracturing using virtual intelligence techniques. SPE 72385, Proceedings, 2001 SPE Eastern Regional Conference and Exhibition, October 17-19, North Canton, Ohio.


Mountcastle, V.B., 1978. An organizing principle for cerebral function: the unit module and the distributed system. In: Edelman, G.M., Mountcastle, V.B. (Eds.), The Mindful Brain: Cortical Organization and the Group-Selective Theory of Higher Brain Function. MIT Press, Cambridge, MA, p. 7.
Nikravesh, M., 2004. Soft computing-based computational intelligence for reservoir characterization. Expert Syst. Appl. 26, 19–38.
Rexrodt, F.W., 1981. Gehirn und Psyche. Hippokrates, Stuttgart, Germany.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning representations by back-propagating errors. Nature 323 (6088), 533–536. doi:10.1038/323533a0.
Saemi, M., Ahmadi, M., 2008. Integration of genetic algorithm and a coactive neuro-fuzzy inference system for permeability prediction from well logs data. Transp. Porous Media 71, 273–288.
Saemi, M., Ahmadi, M., Varjani, Y.A., 2007. Design of neural networks using genetic algorithm for the permeability estimation of the reservoir. J. Petrol. Sci. Eng. 59, 97–105.

Sahimi, M., 2000. Fractal-wavelet neural-network approach to characterization and upscaling of fractured reservoirs. Comput. Geosci. 26, 877–905.
Shepherd, G.M., 1974. The Synaptic Organization of the Brain. Oxford University Press, New York.
Simon, H., 1998. Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall. ISBN 0132733501.
Tahmasebi, P., Hezarkhani, A., 2010a. Application of adaptive neuro-fuzzy inference system for grade estimation; case study, Sungun porphyry copper deposit, East Azerbaijan. Aust. J. Basic Appl. Sci. 4 (3), 408–420.
Tahmasebi, P., Hezarkhani, A., 2010b. Comparison of optimized neural network with fuzzy logic for ore grade estimation. Aust. J. Basic Appl. Sci. 4 (5), 764–772.
Tahmasebi, P., Hezarkhani, A., 2011. Application of a modular feedforward neural network for grade estimation. Nat. Resour. Res. 20 (1), 25–32. doi:10.1007/s11053-011-9135-3.
Van Essen, D.C., Anderson, C.H., Felleman, D.J., 1992. Information processing in the primate visual system. Science 255, 419–423.