Mapping of Soil Contamination by Using Artificial Neural Networks and Multivariate Geostatistics

M. Kanevski (1), V. Demyanov (1), M. Maignan (2)

(1) Institute of Nuclear Safety (IBRAE), B. Tulskaya 52, 113191 Moscow, Russia; [email protected]
(2) University of Lausanne, [email protected]

Abstract. The work deals with the development and use of mixed models (artificial neural networks, ANN, and modern geostatistical models) for the analysis of spatially distributed environmental data. When multivariate data have complex non-linear trends or high variability at different scales in the region of study, it is proposed to use ANN to model non-linear large scale structures (trends) and then to apply multivariate geostatistics (co-kriging models) to the residuals. The proposed model is used for the spatial prediction of soil contamination by Chernobyl radionuclides.

1 Introduction

There are many peculiarities that make the analysis and mapping of environmental data a difficult task: measurement errors, high variability at many spatial and temporal scales (e.g., from metres to hundreds of kilometres in the case of the Chernobyl fallout), non-linear trends, etc. Usually it is difficult or impossible to develop deterministic models based on first principles. That is why considerable effort is devoted to the development and improvement of approaches to the analysis, processing and presentation of environmental spatial data.

2 Problem Description and Objectives

The present study is concerned with the spatial analysis and mapping of environmental data. The main attention is paid to multivariate data analysis, when information is available on several kinds of data. For example, databases on soil contamination by heavy metals or radionuclides contain information about several elements, some of which can be correlated with each other. In the present case study (Chernobyl fallout in Russia) the databases include information about the radionuclides Cs137 (DB1, 680 measurements) and Sr90 (DB2, 286 measurements). Both radionuclides are important from a radiological point of view. The quantity and quality of the information (number of measurements) depend on many factors: monitoring networks, manpower, time, cost, etc. Simple statistical analysis shows that the variables are correlated (the correlation coefficient is higher than 0.75). The basic problem is how to use the additional information about Cs137 measurements in order to improve the mapping of Sr90. Problems of this kind are the main subject of multivariate geostatistics [1,2].

Geostatistics is a model-dependent approach based on variance minimisation and on the development of spatial correlation structures (variograms, and cross-variograms in the multivariate case). At present geostatistical models (the family of kriging models) are widely used in many applications concerning spatial data analysis. An important advantage of geostatistical models is their ability to quantify the quality of spatial predictions with the help of error maps (maps of estimation variances). Most geostatistical models rely on deep expert analysis (e.g., exploratory variography and modelling of spatial correlation structures) and are based on theoretical assumptions that are rarely satisfied in environmental applications (e.g., second-order stationarity; the intrinsic hypothesis, under which the mean value is constant in the region and the variogram depends only on the separation between points).

Another, model-free approach is based on artificial neural network (ANN) algorithms, which have been developed to deal with real-world problems. An important feature of ANN is their robust behaviour with noisy environmental data. Neural networks can be superior to other methods when data exhibit significant unpredictable non-linearity. At present there is great interest and activity in the development and application of ANN to spatial data analysis and mapping (see, for example, the references in [3-6]).

It is reasonable to assume that combining the power of model-free and model-dependent approaches can help in solving real problems of multivariate environmental spatial data mapping (spatial co-estimation). Thus, the main objective of the present investigation is to develop a new mixed model and to show that the combination of ANN and geostatistics can be both useful and efficient. The work deals with the development of a mixed Neural Network Residual Co-kriging (NNRCK) model, a non-stationary model for the spatial co-estimation of correlated variables.
3 The Methodology and Case Study

The present work is an important generalisation to the multivariate case of the ideas first presented for the univariate case in [6]. In short, the basic idea is to use ANN to model large scale non-linear structures (trends) and then to use geostatistical models for the analysis of the residuals, i.e. for modelling small scale variations. Without loss of generality, multilayer perceptrons (MLP) are used in the present study. They are well known global function approximators and their performance rests on a well developed mathematical background. MLP develop a global approximation to a non-linear input-output mapping. They are capable of generalisation in regions of the input space where little or no training data are available [7], which is important in the multivariate case.

3.1 Exploratory Spatial Data Analysis and Variography

Exploratory spatial data analysis consists of the following steps: batch univariate and multivariate statistical analysis, spatial moving window statistics, and trend analysis. This is an important phase of the study for both the ANN and the geostatistical analyses. The basic statistical parameters of the Chernobyl data are as follows: minimum value Cs137 = 0.05, Sr90 = 0.018; mean value Cs137 = 9.16, Sr90 = 0.29; maximum value Cs137 = 97.47, Sr90 = 1.36; variance Cs137 = 98.42, Sr90 = 0.052; skewness Cs137 = 2.7, Sr90 = 2.03; kurtosis Cs137 = 16.9, Sr90 = 8.13. Concentrations are measured in Ci/sq.km. As is usual for environmental data, the distributions are positively skewed and far from normal.

Exploratory variography is the analysis of spatial correlations and cross-correlations. Variograms (i = j) and cross-variograms (i ≠ j) are the tools used most often [1,2]:

$$2\gamma_{ij}(h) = E\left[\left(Z_i(x+h) - Z_i(x)\right)\left(Z_j(x+h) - Z_j(x)\right)\right]$$

where $Z_i$, $Z_j$ are the variables (e.g., Cs, Sr) and $h$ is the separation vector between the points $(x+h)$ and $x$ in space. In the case of co-estimation this function is analysed and modelled for Cs, Sr and Cs+Sr (the cross-variogram). The cross-variogram of the raw data is presented in Fig. 1. The presence of a large scale trend is evident: the cross-variogram rises above the sill (the a priori variance, shown as a dashed line). The variogram behaviour near zero lag distance describes undetected small scale variations and measurement errors.
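The expectation in the definition above is estimated in practice by averaging over all sample pairs whose separation falls into a given lag bin. The following is a minimal sketch of such an omnidirectional estimator, not the authors' software. It assumes that both variables are measured at the same locations, so with differently sized data sets only the shared points can contribute. The function name `cross_variogram` and its parameters are hypothetical.

```python
import math

def cross_variogram(coords, z_i, z_j, lag_width, n_lags):
    """Omnidirectional empirical cross-variogram for collocated samples.

    coords: list of (x, y); z_i, z_j: values of the two variables at those
    points. Passing the same list twice gives the ordinary variogram.
    Returns one (mean lag distance, gamma, pair count) tuple per non-empty bin.
    """
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    n = len(coords)
    for a in range(n):
        for b in range(a + 1, n):
            h = math.hypot(coords[a][0] - coords[b][0],
                           coords[a][1] - coords[b][1])
            k = int(h // lag_width)          # bin k covers [k*w, (k+1)*w)
            if k >= n_lags:
                continue
            # Product of the increments of the two variables over this pair.
            sums[k] += (z_i[a] - z_i[b]) * (z_j[a] - z_j[b])
            dists[k] += h
            counts[k] += 1
    # gamma = (1 / 2N) * sum of increment products, matching 2*gamma = E[...].
    return [(dists[k] / counts[k], 0.5 * sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]
```

A cross-variogram that keeps rising with lag distance instead of levelling off at a sill is the signature of a large scale trend that the text describes for the raw data.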


Fig. 1. Omnidirectional cross-variogram (Cs-Sr) of the raw data; the horizontal axis is the lag distance h [km].

3.2 ANN Spatial Predictions

This stage deals with the preparation of training, testing, and validation data sets, scaling of the data, and non-linear transformations. It is followed by the selection of the ANN algorithm (architecture and learning model). In the present study two kinds of MLP models have been used:

1) 2 input neurones describing the spatial co-ordinates (X, Y); one or two hidden layers; one output neurone describing contamination (Cs137 or Sr90).
2) 2 input neurones describing the spatial co-ordinates; one or two hidden layers; two output neurones describing contamination (Cs137 and Sr90).

The second model corresponds to ANN spatial co-estimation. The network then has to be trained and tested. Backpropagation training was used together with simulated annealing and genetic optimisation algorithms in order to avoid local minima [8]. The trained network was evaluated by using jack-knife, cross-validation, and accuracy tests. The accuracy test, i.e. prediction of the training data set with the trained ANN, is a simple test describing how well the ANN has captured the correlation between locations and contamination. The network was validated by using an independent data set. The ANN is then used for Cs137 and Sr90 spatial predictions/generalisations, i.e. mapping. Results for the Sr90 ANN mapping are presented in Fig. 2. These results were obtained by spatial co-estimation with 5 hidden neurones and two output (Cs+Sr) neurones. It is evident that the ANN has learned the non-linear trends and that small scale variations have been ignored. Even with more hidden neurones it was practically impossible to detect all small scale variations. X and Y co-ordinates are given in cell numbers; the cell size is dX × dY = 1 × 1 sq.km.
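The second architecture (two co-ordinate inputs, one hidden layer with 5 neurones, two outputs) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: it uses plain stochastic-gradient backpropagation only, without the simulated annealing and genetic optimisation mentioned in the text, and it is trained on synthetic stand-in data, not the Chernobyl measurements. All function names are hypothetical. The last lines run the accuracy test as a correlation between training targets and network predictions at the same points.

```python
import math
import random

def init_mlp(n_in, n_hidden, n_out, rng):
    """Small random weights; the last entry of each row is the bias."""
    def layer(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols + 1)]
                for _ in range(rows)]
    return layer(n_hidden, n_in), layer(n_out, n_hidden)

def forward(net, x):
    """Return (hidden activations, outputs) for one input vector."""
    w_h, w_o = net
    hid = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
           for row in w_h]
    out = [sum(w * v for w, v in zip(row, hid)) + row[-1] for row in w_o]
    return hid, out

def train(net, xs, ts, lr=0.05, epochs=500, rng=None):
    """Plain stochastic-gradient backpropagation on the squared error."""
    w_h, w_o = net
    order = list(range(len(xs)))
    for _ in range(epochs):
        if rng is not None:
            rng.shuffle(order)
        for k in order:
            hid, out = forward(net, xs[k])
            d_out = [o - t for o, t in zip(out, ts[k])]
            # Output errors propagated back through the tanh hidden layer.
            d_hid = [(1.0 - h * h) *
                     sum(d * w_o[m][j] for m, d in enumerate(d_out))
                     for j, h in enumerate(hid)]
            for m, d in enumerate(d_out):
                for j, h in enumerate(hid):
                    w_o[m][j] -= lr * d * h
                w_o[m][-1] -= lr * d
            for j, d in enumerate(d_hid):
                for i, v in enumerate(xs[k]):
                    w_h[j][i] -= lr * d * v
                w_h[j][-1] -= lr * d

def mse(net, xs, ts):
    """Mean over samples of the summed squared output errors."""
    err = 0.0
    for x, t in zip(xs, ts):
        out = forward(net, x)[1]
        err += sum((o - v) ** 2 for o, v in zip(out, t))
    return err / len(xs)

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

# Synthetic stand-in data: scaled co-ordinates on a 6 x 6 grid and two
# correlated smooth "contamination" surfaces (assumed, for illustration).
rng = random.Random(0)
xs = [(i / 5.0, j / 5.0) for i in range(6) for j in range(6)]
ts = [(x + y, 0.5 * x + 0.2 * y) for x, y in xs]

net = init_mlp(2, 5, 2, rng)
mse_before = mse(net, xs, ts)
train(net, xs, ts, rng=rng)
mse_after = mse(net, xs, ts)

# Accuracy test for the second output: correlation with the training targets.
pred_second = [forward(net, x)[1][1] for x in xs]
accuracy_r = pearson([t[1] for t in ts], pred_second)
```

Sharing one hidden layer between the two outputs is what makes this a co-estimation: the better sampled variable helps to shape the hidden features that the sparser variable also uses.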


Fig. 2. Sr90, ANN (one hidden layer with 5 neurones) spatial predictions.

3.3 Multivariate Geostatistical Analysis and Modelling of Residuals

A trained neural network is able to extract part of the information contained in the data and described by spatial correlations. The remaining information should be analysed and modelled with the help of the residuals. The residuals obtained are correlated with the original data and are not correlated with the ANN estimates. The residuals for Cs137 and Sr90 are correlated with each other. Exploratory variography of the spatial correlation structures (variograms and cross-variograms) of the residuals is presented in Fig. 3. Both the variograms and the cross-variogram of the residuals can easily be modelled (fitted with theoretical models), and a second-order stationary co-kriging model can be applied because the cross-variogram reaches a sill and stabilises. The ranges (the distance at which a variogram reaches its sill) of the variograms and cross-variograms have shifted to shorter distances in comparison with Fig. 1.

Fig. 3. Omnidirectional cross-variogram (Cs-Sr) of the ANN residuals; the horizontal axis is the lag distance h [km].
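Fitting a theoretical model to an empirical variogram that reaches a sill can be sketched as below. The paper does not state which theoretical model was fitted; the spherical model and the brute-force least-squares search over candidate values are assumptions made for illustration, and the function names are hypothetical.

```python
def spherical(h, nugget, sill, a_range):
    """Spherical model: nugget plus a partial sill reached at distance a_range."""
    if h <= 0.0:
        return 0.0
    if h >= a_range:
        return nugget + sill
    r = h / a_range
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def fit_spherical(lags, gammas, nuggets, sills, ranges):
    """Return the (nugget, sill, range) candidate with the smallest squared error."""
    best = None
    for c0 in nuggets:
        for c in sills:
            for a in ranges:
                sse = sum((spherical(h, c0, c, a) - g) ** 2
                          for h, g in zip(lags, gammas))
                if best is None or sse < best[0]:
                    best = (sse, c0, c, a)
    return best[1:]

# Usage with made-up variogram points generated from a known model.
lags = [2.0 * k for k in range(1, 21)]
gammas = [spherical(h, 0.05, 0.3, 20.0) for h in lags]
params = fit_spherical(lags, gammas,
                       nuggets=[0.0, 0.05, 0.1],
                       sills=[0.2, 0.3, 0.4],
                       ranges=[10.0, 20.0, 30.0])
```

In practice such fits are usually checked by eye against the empirical points, since the behaviour near zero lag (the nugget) has the strongest influence on kriging weights.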


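A geostatistical co-kriging model is then applied to the residuals. Co-kriging is a linear estimator of the unknown residual $Z^*_{i_0}(x_0)$ for Sr90 at an unsampled point $x_0$ that uses both the Cs137 and the Sr90 residuals. It is determined by the following set of equations, which is obtained from the conditions of unbiasedness and minimisation of variance [1,2]:

$$Z^*_{i_0}(x_0) = \sum_{i=1}^{N} \sum_{\alpha=1}^{n_i} w_\alpha^i Z_i(x_\alpha)$$

$$\sum_{j=1}^{N} \sum_{\beta=1}^{n_j} w_\beta^j \gamma_{ij}(x_\alpha - x_\beta) + \mu_i = \gamma_{i i_0}(x_\alpha - x_0), \quad i = 1, 2; \quad \alpha = 1, \ldots, n_i$$

$$\sum_{\beta=1}^{n_i} w_\beta^i = \delta_{i i_0}, \quad i = 1, 2$$

Here $N = 2$ is the number of variables, $n_i$ is the number of samples of variable $i$, $w_\alpha^i$ are the co-kriging weights, $\mu_i$ are Lagrange multipliers and $\delta_{i i_0}$ is the Kronecker delta.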
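The co-kriging system (one equation per sample plus one unbiasedness constraint per variable) can be assembled and solved directly for small data sets. The sketch below is a hedged illustration, not the authors' code: it assumes a linear model of coregionalisation in which every variogram and cross-variogram is a coefficient `B[i][j]` times one unit-sill spherical structure without nugget, and it uses made-up residual values. Variable index 0 stands for Cs137 and index 1 for Sr90; all names are hypothetical.

```python
import math

def spherical(h, a):
    """Unit-sill spherical variogram with range a (no nugget)."""
    if h >= a:
        return 1.0
    r = h / a
    return 1.5 * r - 0.5 * r ** 3

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def cokrige(samples, B, a, x0, i0):
    """Ordinary co-kriging estimate of variable i0 at point x0.

    samples: list of (variable index, (x, y), value).
    B: coregionalisation matrix, gamma_ij(h) = B[i][j] * spherical(h, a).
    Returns (estimate, weights in the order of samples).
    """
    m, nvar = len(samples), len(B)
    n = m + nvar
    A = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for r, (i, p, _) in enumerate(samples):
        for c, (j, q, _) in enumerate(samples):
            A[r][c] = B[i][j] * spherical(dist(p, q), a)
        A[r][m + i] = 1.0      # Lagrange multiplier mu_i
        A[m + i][r] = 1.0      # row of the unbiasedness constraint for i
        rhs[r] = B[i][i0] * spherical(dist(p, x0), a)
    rhs[m + i0] = 1.0          # weights of i0 sum to 1, all others to 0
    w = solve(A, rhs)[:m]
    return sum(wk * s[2] for wk, s in zip(w, samples)), w

# Made-up residuals: four Cs137 samples (index 0), two Sr90 samples (index 1).
samples = [(0, (0.0, 0.0), 1.0), (0, (1.0, 0.0), 2.0),
           (0, (0.0, 1.0), 1.5), (0, (2.0, 2.0), 3.0),
           (1, (0.0, 0.0), 0.2), (1, (2.0, 2.0), 0.6)]
B = [[1.0, 0.6], [0.6, 0.5]]   # assumed positive definite
estimate, weights = cokrige(samples, B, 5.0, (1.0, 1.0), 1)
```

With no nugget, evaluating `cokrige` at one of the Sr90 sample locations returns that sample's value, which is the exactness property that carries over to the combined model.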

3.4 Neural Network Residual Co-kriging Mapping

NNRCK = ANN predictions + geostatistical co-estimation of the residuals. The results of the mixed model are presented in Fig. 4. The final stage is the validation of the NNRCK results. There is much more variability on the map in Fig. 4 than on the map in Fig. 2: the NNRCK model describes both trends and small scale variations. The NNRCK model is an exact interpolator, i.e. it honours the measured data: when measurement errors at the sampling points are negligible, the NNRCK estimate equals the measured value.


Fig. 4. Mapping of Sr90 with neural network residual co-kriging model (NNRCK).
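The exactness of the mixed model follows in one step from the way the residuals are defined. The notation below is introduced here for illustration only and is not taken from the paper: $F_i$ is the trained ANN output for variable $i$, $R_i$ is the residual at a sampling point $x_\alpha$, and $R^*_{i_0}$ is the co-kriging estimate of the residual.

$$R_i(x_\alpha) = Z_i(x_\alpha) - F_i(x_\alpha), \qquad Z^{\mathrm{NNRCK}}_{i_0}(x_0) = F_{i_0}(x_0) + R^*_{i_0}(x_0)$$

Co-kriging without a measurement-error (nugget) term reproduces the residual at every sampling point of the estimated variable, $R^*_{i_0}(x_\alpha) = R_{i_0}(x_\alpha)$, so that

$$Z^{\mathrm{NNRCK}}_{i_0}(x_\alpha) = F_{i_0}(x_\alpha) + Z_{i_0}(x_\alpha) - F_{i_0}(x_\alpha) = Z_{i_0}(x_\alpha).$$

The result does not depend on how well the ANN fits the data: a poorer trend simply leaves larger residuals for the co-kriging step to restore.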

Several important points should be mentioned.

1) Analysis of the residuals is also important when only ANN mapping is applied, because it helps to assess the quality of the ANN mapping. If there are no spatial correlations between the residuals, all spatial information has been extracted and the ANN alone can be used for prediction mapping.

2) Robustness of the approach: how sensitive is it to the selection of the ANN architecture and learning algorithm? In [6] it was shown that the summary statistics of the residuals described by variograms are robust with respect to the ANN architecture (number of hidden layers and neurones). The same robust behaviour has been obtained in the multivariate case presented in this study. So we can choose the simplest network capable of learning and capturing the non-linear trends. Usually accuracy tests have been used for the analysis and description of what has been learned by the ANN. The accuracy test measures the correlation between the training data set and the ANN predictions at the same points.

3) Data clustering is a well known problem in spatial data analysis [2]. It is related to the spatial representativity of the data. We have used spatial declustering procedures [2] for preparing all data sets.
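The text does not say which declustering procedure was applied. One common choice described in the geostatistical literature is cell declustering, in which each sample is weighted inversely to the number of samples sharing its grid cell. The sketch below assumes that procedure with a fixed cell size and origin; the function names and the example values are hypothetical.

```python
from collections import Counter

def cell_declustering_weights(coords, cell_size, origin=(0.0, 0.0)):
    """Weight each sample by n / (occupied cells * samples in its cell).

    The weights sum to the number of samples, so an unclustered data set
    with one sample per cell gets a weight of 1 everywhere.
    """
    cells = [(int((x - origin[0]) // cell_size),
              int((y - origin[1]) // cell_size)) for x, y in coords]
    per_cell = Counter(cells)
    n_occupied = len(per_cell)
    n = len(coords)
    return [n / (n_occupied * per_cell[c]) for c in cells]

def weighted_mean(values, weights):
    """Declustered mean: average of values with the given weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Three samples clustered in one cell around a high value, one isolated sample.
coords = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (1.5, 0.5)]
values = [10.0, 10.0, 10.0, 2.0]
weights = cell_declustering_weights(coords, cell_size=1.0)
naive_mean = sum(values) / len(values)
declustered_mean = weighted_mean(values, weights)
```

In this example the naive mean is pulled towards the oversampled high values, while the declustered mean treats the two occupied cells equally.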

4 Conclusions

A new non-stationary model, NNRCK (Neural Network Residual Co-Kriging), has been developed for the analysis and mapping of spatially distributed data. Non-linear trends in multivariate environmental data can be efficiently modelled by three-layer perceptrons. The promising results presented are based on an important case study: soil contamination by the most radiologically important Chernobyl radionuclides. Other kinds of ANN models (including local approximators) can be used with possible modifications. The approach seems to be useful in many cases when it is important to model and remove non-linear trends or large scale spatial structures. Within the framework of the developed methodology the same approach can be used for stochastic co-simulations in the case of probabilistic (risk) environmental mapping.

References

1. Wackernagel H. (1995) Multivariate Geostatistics. Springer-Verlag, Berlin Heidelberg, 256 p.
2. Deutsch C.V. and Journel A.G. (1992) GSLIB: Geostatistical Software Library and User's Guide. Oxford University Press, New York, 340 p.
3. Dowd P.A. (1994) The Use of Neural Networks for Spatial Simulation. In R. Dimitrakopoulos (Ed.) Geostatistics for the Next Century, Kluwer Academic Publishers, pp. 173-184.
4. Dowla F.U. and Rogers L.L. (1995) Solving Problems in Environmental Engineering and Geosciences with Artificial Neural Networks. The MIT Press, Cambridge, Massachusetts, 239 p.
5. Kanevsky M. (1994) Artificial Neural Networks and Spatial Interpolations. Case Study: Chernobyl Fallout. Preprint IBRAE-95-07, 39 p.
6. Kanevsky M., Arutyunyan R., Bolshov L., Demianov V., and Maignan M. (1996) Artificial Neural Networks and Spatial Estimation of Chernobyl Fallout. Geoinformatics, vol. 7, Nos. 1-2, pp. 5-11.
7. Haykin S. (1994) Neural Networks. A Comprehensive Foundation. Macmillan College Publishing Co., New York, 696 p.
8. Masters T. (1993) Practical Neural Network Recipes in C++. Academic Press, 493 p.
