TSINGHUA SCIENCE AND TECHNOLOGY ISSN 1007-0214 05/17 pp281-287 Volume 15, Number 3, June 2010
Approximating Nonlinear Relations Between Susceptibility and Magnetic Contents in Rocks Using Neural Networks
William W Guo**, Michael Li, Zhengxiang Li†, Greg Whymark
Faculty of Arts, Business, Informatics and Education, Central Queensland University, Rockhampton QLD 4701, Australia;
† The Institute for Geoscience Research (TIGeR), Curtin University of Technology, Perth WA 6845, Australia
Abstract: Correlations between magnetic susceptibility and contents of magnetic minerals in rocks are important in interpreting magnetic anomalies in geophysical exploration and understanding magnetic behaviors of rocks in rock magnetism studies. Previous studies focused on describing such correlations using a sole expression or a set of expressions obtained through statistical analysis. In this paper, we use neural network techniques to approximate the nonlinear relations between susceptibility and magnetite and/or hematite contents in rocks. This is the first time that neural networks have been used for such a study in rock magnetism and magnetic petrophysics. Three multilayer perceptrons are trained to produce the best possible estimates of susceptibility based on magnetic contents. These trained models are capable of producing accurate mappings between susceptibility and magnetite and/or hematite contents in rocks. This approach opens a new way of quantitative simulation using neural networks in rock magnetism and petrophysical research and applications.
Key words: neural networks; nonlinear function approximation; rock magnetism; magnetic susceptibility; magnetic contents
Introduction
Magnetism of a rock depends on the magnetic minerals that the rock contains. Usually the magnetic properties of a rock are determined by ferromagnetic minerals if they are present. Magnetite and hematite are the most common ferromagnetic minerals in rocks. For example, magnetite and hematite often form some 5% by weight of igneous and metamorphic rocks, and are present in many sedimentary rocks in various fractions[1]. Although many factors, such as the grain size of magnetic minerals, may affect the magnetic properties of a rock, the content of magnetic minerals in a rock is the predominant factor.
Received: 2010-04-13; revised: 2010-05-08
** To whom correspondence should be addressed. E-mail:
[email protected]; Tel: 61-7-49309687
Great effort has been made in understanding the relations between magnetic properties (particularly magnetic susceptibility) and the content of magnetite or hematite for the purposes of interpreting magnetic anomalies[2-5] and rock magnetism study[1], and a few statistical correlations between susceptibility and magnetite content have been reported in some of these studies. However, most magnetite contents used in these early studies were determined by magnetic separation plus chemical analysis[2-4] or microscopic grain counting[5]. Compared with more recent analysis techniques, these methods are far less accurate. Currently, the more accurate X-ray diffraction (XRD) analysis is widely used to determine magnetic contents in rocks. Therefore, to couple magnetic susceptibility with magnetic contents obtained from XRD analysis, there is a need to establish new correlations between susceptibility
and weight percentage of magnetite and/or hematite in rocks. In practice, susceptibility data are relatively easy and cheap to obtain in the laboratory, whereas XRD analysis is more expensive and thus is applied only to selected samples. XRD data availability is further limited because most XRD analysis is financially supported by industry partners, who usually impose some restrictions on data release. This could partly explain why only very few new studies in this area have been reported in the last two decades[6,7].
The magnetic susceptibility and XRD data used in this study were collected in a petrophysical study partly supported by three mining companies during 1995-1999[6]. That study showed that for rocks containing less than 0.5% magnetite by weight, an exponential correlation exists between susceptibility and weight percent of hematite, whereas magnetite dominates the susceptibility of rocks through a power law if the rocks contain more than 0.5% magnetite by weight. These data were reanalyzed using a statistical data mining approach[7], which revealed that the nonlinearity of the correlation between susceptibility and magnetite could not be sufficiently described using one single expression; instead, segmentation-based multiple fittings seemed more useful. This complexity of the nonlinear relations between susceptibility and magnetic contents implies that new strategies, different from the traditional philosophy that a correlation could be described by a sole expression or a set of expressions through statistical analysis, must be adopted in approximating such nonlinearity. In this paper, neural network techniques are used for approximating nonlinear relations between magnetic susceptibility and contents of magnetic minerals in rocks.
This study focuses on training a dynamic neural network to produce the best possible estimate of susceptibility with respect to the given magnetic contents, rather than finding a general expression for describing such a correlation. This is an innovative application of neural networks in rock magnetism and magnetic petrophysics. In the following sections, we briefly introduce the process of collection, preprocessing, and classification of the rock magnetic data used for our study, and then present the mathematical description of this nonlinear problem. We then outline the processes for neural network model selection, training, and testing using these classified data. Finally, discussion and conclusions are drawn based on the outcomes of the neural network simulation.
1 Data Collection, Preprocessing, and Classification
A total of 573 rock samples were collected from 114 sites in the northwest of Western Australia. Typically six to nine standard specimens were extracted from each rock sample. The magnetic susceptibility of every specimen was measured individually, yielding more than 3000 susceptibility measurements. Since XRD analysis is expensive, only 43 composite samples were selected for XRD analysis. Each composite sample was made of a few small rock pieces taken from the same rock sample so as to minimize the bias towards a particular specimen. This is necessary because magnetic minerals are usually distributed unevenly in a rock and even within a rock sample. The composite sample was then crushed to fine grains for XRD analysis. Both magnetic susceptibility measurement and XRD analysis were carried out in the laboratories at The University of Western Australia. Magnetite and hematite were found to be the main carriers of magnetism in these samples.
Nearly 300 susceptibility measurements can be mapped to the 43 XRD samples because multiple specimens can be mapped to the same rock sample. These one-to-many mappings must be rationalized before data analysis takes place. The simplest way to do so is to use the average susceptibility of all specimens taken from the same rock sample, which gives 43 one-to-one mappings between magnetic content and susceptibility. These 43 mappings are probably sufficient for statistical data analysis and were indeed used in the previous studies[6,7]. However, 43 mappings are not sufficient for training a reliable neural network. Instead of directly using the average susceptibility from all specimens of the same rock sample, the magnetic content of an XRD sample can be redistributed to each individual specimen through its susceptibility based on the following formula
$$c_{ij} = \frac{s_{ij}}{s_j}\, c_j \qquad (1)$$
where $c_j$ is the magnetic content of the j-th XRD sample; $s_j$
is the average susceptibility of all specimens from the corresponding j-th rock sample; $s_{ij}$ is the susceptibility of the i-th specimen from the j-th rock sample; and $c_{ij}$ is the magnetic content redistributed to the i-th specimen from the j-th XRD sample. Such redistribution produces about 300 new mappings. These new mappings are further classified into three subclasses: magnetite-susceptibility (Mag-Sus), hematite-susceptibility (Hem-Sus) where magnetite is less than 0.5%, and magnetite-hematite-susceptibility (Mag/Hem-Sus) where both magnetite and hematite are present. This classification is based on the findings of the previous statistical analysis[6]. A summary of these subclasses is shown in Table 1. The mappings are split for neural network training and testing at a ratio of about 83% to 17%.

Table 1 Classifications of susceptibility and magnetic content mappings
Subclass       Mappings   Training size   Testing size
Mag-Sus        239        202 (82%)       37 (18%)
Hem-Sus        268        231 (84%)       37 (16%)
Mag/Hem-Sus    144        123 (83%)       21 (17%)
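The redistribution in Eq. (1) can be illustrated with a minimal Python sketch. The function name, the dictionary layout, and the numerical values are assumptions made for illustration only; they are not data or code from the study.

```python
def redistribute_content(specimen_sus, xrd_content):
    """Apply Eq. (1): c_ij = (s_ij / s_j) * c_j.

    specimen_sus: dict mapping a rock sample id to the list of specimen
                  susceptibilities s_ij measured for that sample.
    xrd_content:  dict mapping the same id to the XRD content c_j.
    Returns a dict mapping each id to the list of redistributed contents c_ij.
    """
    redistributed = {}
    for sample_id, s_values in specimen_sus.items():
        if not s_values:
            raise ValueError(f"no specimens for sample {sample_id}")
        s_mean = sum(s_values) / len(s_values)  # s_j, the sample average
        if s_mean == 0:
            raise ValueError(f"zero mean susceptibility for {sample_id}")
        c_j = xrd_content[sample_id]
        redistributed[sample_id] = [s_ij / s_mean * c_j for s_ij in s_values]
    return redistributed


# Hypothetical values for illustration only (not data from the study).
example_sus = {"sample_A": [0.01, 0.02, 0.03]}
example_content = {"sample_A": 2.0}
example_result = redistribute_content(example_sus, example_content)
```

By construction, the mean of the redistributed contents of a sample equals its XRD content, so the composite measurement is preserved while each specimen receives its own mapping.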
2 Problem Description
Previous studies have revealed some correlations between magnetic contents and susceptibility in rocks and ores[2-4,6,7], but they only indicate the general trends between susceptibility and magnetic contents. In summary, the following knowledge has been established by these studies. A power law seems to exist between susceptibility (s) and magnetite content (m) when magnetite is higher than 0.5% by weight in rocks[2-4,6,7], i.e.,
$$s = a m^{b} \qquad (2)$$
where a and b are statistical constants depending on the datasets used. For rocks containing less than 0.5% magnetite by weight, an exponential correlation seems to exist between susceptibility and hematite (h) content[6], i.e.,
$$s = c d^{h} \qquad (3)$$
where c and d are statistical constants depending on the datasets used. Since magnetite and hematite are the most common and persistent magnetic minerals in rocks, the susceptibility of rocks is largely determined by the proportions of these two minerals. For a general case where both magnetite and hematite are present in rocks, a logical inference for such a relation would lead to the following expression:
$$s = a m^{b} + c d^{h} + e f(m, h) \qquad (4)$$
where e is another data-dependent constant and f(m, h) is an unknown function depending on both magnetite and hematite contents. These relations are all nonlinear, data-dependent functions without unique solutions. The best effort to get a usable solution is through approximation using a collection of data for a particular case. Statistical approximation has proven too coarse for the purpose of simulation in magnetic petrophysics[6,7], so new approaches are needed to achieve a better approximation for this purpose.
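To make the statistical baseline of Eqs. (2) and (3) concrete, the constants can be estimated by linear least squares after a logarithmic transform. This is a minimal sketch under that assumption; the cited studies may have used different fitting procedures, and the function names and synthetic data below are hypothetical.

```python
import numpy as np


def fit_power_law(m, s):
    """Estimate a, b in s = a * m**b from log(s) = log(a) + b * log(m)."""
    m = np.asarray(m, dtype=float)
    s = np.asarray(s, dtype=float)
    if np.any(m <= 0) or np.any(s <= 0):
        raise ValueError("power-law fit in log space needs positive m and s")
    b, log_a = np.polyfit(np.log(m), np.log(s), 1)  # slope, intercept
    return float(np.exp(log_a)), float(b)


def fit_exponential(h, s):
    """Estimate c, d in s = c * d**h from log(s) = log(c) + h * log(d)."""
    h = np.asarray(h, dtype=float)
    s = np.asarray(s, dtype=float)
    if np.any(s <= 0):
        raise ValueError("exponential fit in log space needs positive s")
    log_d, log_c = np.polyfit(h, np.log(s), 1)  # slope, intercept
    return float(np.exp(log_c)), float(np.exp(log_d))


# Synthetic, noise-free data for illustration only (not data from the study).
m_demo = np.array([0.5, 1.0, 2.0, 4.0])
s_mag_demo = 0.03 * m_demo ** 1.2
a_hat, b_hat = fit_power_law(m_demo, s_mag_demo)

h_demo = np.array([0.0, 1.0, 2.0, 3.0])
s_hem_demo = 0.001 * 1.5 ** h_demo
c_hat, d_hat = fit_exponential(h_demo, s_hem_demo)
```

Each fit returns a single pair of constants for the whole dataset, which is exactly the limitation discussed above: one expression describes the general trend but cannot follow local departures from it.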
3 Approximating Nonlinear Relations by Neural Networks
3.1 Neural network model selection
The core of a neural network is actually an adaptive mathematical model that is capable of approximating any arbitrary unknown function constrained by training datasets. It has been proven that a three-layer multilayer perceptron (MLP) neural network can approximate any continuous function mapped from one finite-dimensional space to another by adjusting the number of nodes in the hidden layer[8]. The structure of a three-layer MLP with a hidden layer of L nodes, a p-dimensional input vector x, and a q-dimensional output vector y is illustrated in Fig. 1.
Fig. 1 Three-layer MLP
The relationship between the input and output components for this MLP can be generally expressed as
$$y_k = \varphi\left(\sum_{j=1}^{L} w_{2,kj}\,\psi\left(\sum_{i=1}^{p} w_{1,ji}\,x_i\right)\right) \qquad (5)$$
where $\varphi$ and $\psi$ are the transfer functions; $w_{1,ji}$ denotes the input-to-hidden layer weights at the hidden neuron j; and $w_{2,kj}$ denotes the hidden-to-output layer weights at the output unit k. Despite this generalized formula, the outcome of an MLP is a set of numerical values rather than an analytical formula like that resulting from statistical analysis, should one exist. If the network is well trained, the MLP only returns the closest approximated values in response to new input data.
For our problem, the single output of such an MLP is obviously susceptibility, but the input varies with different magnetic minerals. We use not only a single input of either magnetite or hematite content for approximating 2-D correlations between susceptibility and either mineral, but also two inputs of both magnetite and hematite contents for approximating the 3-D correlation between both minerals and susceptibility, which has not previously been reported. Although a single hidden layer is technically sufficient for achieving satisfactory approximation[8,9], there is no universal rule for selecting the number of nodes in the hidden layer, even though some simple rules of thumb have been proposed[10]. Hence such selection is determined by running a number of experiments for individual cases, or by the practitioner's strategy[11].
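The input-output relation of Eq. (5) can be sketched as a short NumPy function. This is an illustrative sketch, not the MATLAB implementation used in the study: the function name is hypothetical, bias terms are omitted because Eq. (5) does not show them, and the random weights stand in for trained ones.

```python
import numpy as np


def mlp_forward(x, w1, w2, hidden_fn=np.tanh, output_fn=lambda z: z):
    """Evaluate Eq. (5) for one input vector.

    x:  input vector of shape (p,)
    w1: input-to-hidden weights of shape (L, p), entries w_{1,ji}
    w2: hidden-to-output weights of shape (q, L), entries w_{2,kj}
    hidden_fn plays the role of psi, output_fn the role of phi.
    """
    hidden = hidden_fn(w1 @ x)     # psi(sum_i w_{1,ji} x_i) for each j
    return output_fn(w2 @ hidden)  # phi(sum_j w_{2,kj} * hidden_j) for each k


# Illustration with untrained random weights (assumed values):
# two inputs (magnetite, hematite), 80 hidden nodes, one output.
rng = np.random.default_rng(0)
w1_demo = rng.normal(size=(80, 2))
w2_demo = rng.normal(size=(1, 80))
y_demo = mlp_forward(np.array([1.2, 0.4]), w1_demo, w2_demo)
```

The defaults use a hyperbolic tangent hidden layer and an identity output, which is one common choice for function approximation; any other pair of transfer functions can be passed in.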
3.2 Neural network training
The neural network process involves two phases: training the network with known datasets and testing the trained network using different known datasets for model generalization. Normally a performance function is used to control the network training process. A good reference to performance functions commonly used for controlling neural network training is given by Qi and Zhang[12]. Mean square error (MSE) is chosen as the performance function to control the process of neural network training in this study:
$$\mathrm{MSE} = \frac{1}{N}\sum_{t=1}^{N}\left(s_o(t) - s_s(t)\right)^2 \qquad (6)$$
where $s_o$ and $s_s$ are the original and simulated values, respectively. We also choose the Levenberg-Marquardt (LM) algorithm[13] to train the selected MLPs because this algorithm has been reported to be the fastest method for training moderate-sized feedforward neural networks[14,15]. For the LM algorithm, weights (w) are updated according to the following formula
$$w_{ij}(t+1) = w_{ij}(t) + \Delta w_{ij}(t) \qquad (7)$$
with
$$\Delta w_{ij} = -\left(J^{T}J + \mu I\right)^{-1} J^{T} e \qquad (8)$$
where J is the Jacobian matrix containing first derivatives of the network errors with respect to the weights, e is a vector of network errors, $\mu$ is the damping parameter, and I is the identity matrix. The LM algorithm was designed to approach second-order training speed without having to compute the Hessian matrix, which is approximated as
$$H = J^{T}J \qquad (9)$$
Therefore, it is computationally faster than Newton's method and the gradient methods. A detailed description of the LM algorithm can be found in Marquardt[13], Hagan and Menhaj[14], and Hagan et al.[15]
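The update of Eqs. (7) to (9) can be sketched in NumPy as follows. This is a generic illustration rather than the toolbox routine used in the study: the function names, the factor of 10 used to adapt the damping parameter, and the toy linear problem are all assumptions.

```python
import numpy as np


def mse(s_o, s_s):
    """Eq. (6): mean square error between original and simulated values."""
    return float(np.mean((np.asarray(s_o) - np.asarray(s_s)) ** 2))


def lm_step(w, residual_fn, jacobian_fn, mu):
    """One LM update: w + dw with dw = -(J^T J + mu*I)^-1 J^T e."""
    e = residual_fn(w)   # error vector, shape (N,)
    J = jacobian_fn(w)   # derivatives of errors w.r.t. weights, shape (N, n)
    H = J.T @ J          # Hessian approximation of Eq. (9)
    dw = -np.linalg.solve(H + mu * np.eye(len(w)), J.T @ e)
    return w + dw


def lm_fit(w, residual_fn, jacobian_fn, mu=1e-3, n_iter=50):
    """Repeat LM steps, lowering mu after a success and raising it otherwise."""
    w = np.asarray(w, dtype=float)
    err = np.mean(residual_fn(w) ** 2)
    for _ in range(n_iter):
        w_new = lm_step(w, residual_fn, jacobian_fn, mu)
        err_new = np.mean(residual_fn(w_new) ** 2)
        if err_new < err:
            w, err, mu = w_new, err_new, mu / 10.0  # accept the step
        else:
            mu *= 10.0                              # reject and damp more
    return w


# Toy linear problem with hypothetical data: errors e(w) = X w - y.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
residuals = lambda w: X @ w - y
jacobian = lambda w: X
w_fit = lm_fit(np.zeros(2), residuals, jacobian)
```

With a small damping parameter the step approaches a Gauss-Newton step, and with a large one it approaches a short gradient-descent step, which is why adapting it between iterations gives both speed and stability. For an MLP, `residual_fn` would return the network errors over the training set and `jacobian_fn` their derivatives with respect to all weights.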
3.3 Training and testing results
The datasets are split randomly into training and testing subsets which are approximately at a ratio of 83% to 17% in general. The details are given in Table 1. Our MLP models are built using the neural network tools in MATLAB®[16,17]. The training of three-layer MLPs is based on running a number of experiments for datasets in different subclasses. These experiments indicate that there is no significant difference between the logsig-linear and tansig-linear combinations as the transfer functions for the hidden and output layers respectively. Therefore, the tansig-linear combination is chosen as the transfer functions for our MLPs. Assuming that an MSE smaller than 0.0001 indicates a good fit being achieved, experiments using hidden layers of 25, 50, 80, 100, 150, 200, and 250 nodes show that a hidden layer with 80 nodes can achieve the target MSE within 10 epochs (Fig. 2) and produce the most balanced outcome, i.e., neither under-fit nor over-fit. Therefore, the outcomes of a hidden layer of 80 nodes will be used for our discussion later. Other neighboring MLPs also produce satisfactory outcomes that are shown in Table 2 for comparison. The 80-neuron MLPs return consistently satisfactory results for all three subclasses in terms of both the mean absolute error (MAE) and the maximum error (Max) (Table 3). Among the three subclasses, both Mag-Sus and Mag/Hem-Sus return almost a perfect correlation between the targets and simulated data whereas Hem-Sus shows a trend of underestimating the targets at the higher end (Fig. 3). These features are
demonstrated clearly in Fig. 4, in which the fits for both Mag-Sus and Mag/Hem-Sus are visually near-perfect, whereas the simulated values are mostly smaller than the targets for the Hem-Sus model.
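The testing measures reported for the trained models (mean absolute error, maximum error, and the correlation between targets and simulated outcomes) can be computed as in the following sketch. The function name and the sample values are hypothetical, and the correlation is assumed here to be Pearson's coefficient, which the text does not state explicitly.

```python
import numpy as np


def evaluate_fit(targets, simulated):
    """Return MAE, maximum absolute error, and Pearson correlation."""
    t = np.asarray(targets, dtype=float)
    s = np.asarray(simulated, dtype=float)
    abs_err = np.abs(t - s)
    return {
        "mae": float(abs_err.mean()),
        "max": float(abs_err.max()),
        "correlation": float(np.corrcoef(t, s)[0, 1]),
    }


# Hypothetical targets and simulated outcomes, for illustration only.
demo_metrics = evaluate_fit([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

A model that systematically underestimates the targets at the higher end, as described for Hem-Sus, can still show small absolute errors while its correlation drops, so the three measures are best read together.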
Fig. 2 Training curves for Mag-Sus (a), Hem-Sus (b), and Mag/Hem-Sus (c) subclasses with 80-node hidden-layer MLPs
Fig. 3 Linear regression between the targets (T) and simulated outcomes (S) with 80-node hidden-layer MLPs for Mag-Sus (a), Hem-Sus (b), and Mag/Hem-Sus (c) subclasses
Fig. 4 Plots of targets and simulated outcomes with 80-node hidden-layer MLPs for samples of Mag-Sus (a), Hem-Sus (b), and Mag/Hem-Sus (c) subclasses

Table 2 MLP training results (MSE)
Subclass       50 nodes     80 nodes     100 nodes
Mag-Sus        6.3×10^-5    5.1×10^-5    3.6×10^-5
Hem-Sus        2.7×10^-7    2.3×10^-7    2.1×10^-7
Mag/Hem-Sus    1.4×10^-6    1.8×10^-7    1.1×10^-9

Table 3 Testing results of the 80-node hidden-layer MLPs
Subclass       MAE       Max       Correlation
Mag-Sus        0.0053    0.0350    0.998
Hem-Sus        0.0002    0.0008    0.901
Mag/Hem-Sus    0.0000    0.0000    1.000

4 Discussion and Conclusions
All three MLPs produce satisfactory approximations to nonlinear functions between the contents of magnetic minerals and the susceptibility in rocks. These trained neural networks are capable of producing accurate mappings between susceptibility and magnetite and/or hematite contents in rocks. Such quantitative simulation provided by the MLP models opens a new way in rock magnetism and petrophysical research and applications. This cannot be achieved by using statistical methods because the statistical correlations only offer qualitative trends between the factors[6,7]. However, statistics still plays an important role in discovering general patterns among the relevant factors, which can hardly be provided by neural networks. For example, statistical analysis has shown that an exponential correlation seems to exist between susceptibility and hematite content for rocks containing less than 0.5% magnetite by weight[6]. Such knowledge provides a general guide for researchers to interpret and better understand some magnetic phenomena in rock magnetism, even though this rule is too coarse for producing a reliable susceptibility value. In contrast, the MLP models can produce accurate simulations, but offer no general description of the hidden nonlinear functions.
For the single-input MLPs, the Mag-Sus model performs much better than the Hem-Sus model. This may be attributed to the fact that magnetite is far more magnetic than any other mineral in rocks, so its presence at more than 0.5% by weight will see it dominate other minerals in magnetism, particularly hematite. On the other hand, a trace of magnetite, even one too small to be detected by XRD, could still make a non-negligible contribution to the magnetism of a hematite-dominated rock. This introduces some bias into the training of the Hem-Sus MLP, which results in a general underestimation of susceptibility by hematite alone.
Undoubtedly the best outcome is achieved by combining the two inputs together for training the neural networks.

Acknowledgements
The Commonwealth Government of Australia and The University of Western Australia are thanked for supporting this research through scholarship schemes. The Hamersley Iron Pty Ltd, BHP Iron Ore, and Robe River Iron Association are thanked for the financial and field assistance.
References
[1] Tarling D H, Hrouda F. The Magnetic Anisotropy of Rocks. London, England: Chapman & Hall, 1993.
[2] Mooney H M, Bleifuss R. Magnetic susceptibility measurements in Minnesota: II, Analysis of field results. Geophysics, 1953, 18: 383-393.
[3] Balsley J R, Buddington A F. Iron-titanium oxide minerals, rocks and aeromagnetic anomalies of the Adirondack area, New York. Economic Geology, 1958, 53: 777-805.
[4] Jahren C E. Magnetic susceptibility of bedded iron formation. Geophysics, 1963, 28: 756-766.
[5] Webb J E. The search for iron ore, Eyre Peninsula, South Australia. In: Mining Geophysics: Vol. I, Case Histories. Oklahoma, USA: Society of Exploration Geophysicists, 1966.
[6] Guo W. Magnetic petrophysics and density investigations of the Hamersley Province, Western Australia: Implications for magnetic and gravity interpretation [Dissertation]. Perth, Australia: The University of Western Australia, 1999.
[7] Guo W. A regression algorithm for rock magnetic data mining. WSEAS Transactions on Information Science and Applications, 2005, 2: 671-678.
[8] Hornik K, Stinchcombe M, White H. Multilayer feedforward networks are universal approximators. Neural Networks, 1989, 2: 359-366.
[9] White H. Some asymptotic results for learning in single hidden layer feedforward network models. Journal of American Statistical Association, 1989, 84: 1008-1013.
[10] Rumelhart D E, Hinton G E, Williams R J. Learning internal representations by error propagation. In: Rumelhart D E, McClelland J L, eds. Parallel Distributed Processing. Cambridge, USA: MIT Press, 1986.
[11] Curry B, Morgan P H. Model selection in neural networks: Some difficulties. European Journal of Operational Research, 2006, 170: 567-577.
[12] Qi M, Zhang G P. An investigation of model selection criteria for neural network time series forecasting. European Journal of Operational Research, 2001, 132: 666-680.
[13] Marquardt D. An algorithm for least-squares estimation of nonlinear parameters. SIAM Journal of Applied Mathematics, 1963, 11: 431-441.
[14] Hagan M T, Menhaj M. Training feedforward networks with the Marquardt algorithm. IEEE Transactions on Neural Networks, 1994, 5: 989-993.
[15] Hagan M T, Demuth H B, Beale M H. Neural Network Design. Boston, USA: PWS Publishing, 1996.
[16] Demuth H, Beale M. Neural Network Toolbox for Use with Matlab. Natick, USA: The MathWorks, 2004.
[17] Demuth H, Beale M, Hagan M. Neural Network Toolbox 5. Natick, USA: The MathWorks, 2007.