Spatial Predictions of Soil Contamination Using General Regression Neural Networks

M. F. Kanevski. Spatial Predictions of Soil Contamination Using General Regression Neural Networks. Int. Journal of Systems Research and Information Systems, Volume 8, number 4. p. 241-256, 1999.

“Spatial Predictions of Soil Contamination Using General Regression Neural Networks”

Mikhail F. Kanevski

Contact address: Institute of Geomatics and Analysis of Risk (IGAR), University of Lausanne, Switzerland [email protected] www.unil.ch/igar



Abstract. This work applies general regression neural networks (GRNN) to the spatial prediction of radioactively contaminated territories. GRNN are a class of neural networks widely used for continuous function mapping. They are based on well-known nonparametric (kernel) statistical estimators. An important advantage of the GRNN is that training is very fast and adding new data is almost free. The good mapping capabilities of the GRNN are demonstrated on real data. A detailed analysis of the residuals, including univariate statistics and variography, is presented. The case study is based on real data sets on surface contamination by the Chernobyl radionuclide Cs137.

Key words: spatial predictions, artificial neural networks, radioecology

INTRODUCTION

There is a great variety of modern approaches and tools (geostatistics, artificial neural networks, fractals, wavelets, etc.) for the analysis and modelling of spatially distributed and time-dependent data. The selection of an appropriate model depends on the quality and quantity of the data and on the final objectives of the study. One of the most developed approaches is based on geostatistical methodology and models (Cressie 1991, Deutsch and Journel 1998, Goovaerts 1997). An important advantage of geostatistical models is their ability to quantify the quality of spatial predictions with the help of error maps (maps of estimation variances). Geostatistical methods (the family of kriging models) belong to the best linear unbiased predictors (BLUP). Recent advances in geostatistics deal with stochastic simulations and risk mapping. Most geostatistical models rely on deep expert analysis (e.g., exploratory variography and modelling of spatial correlation structures) and are based on theoretical assumptions that can rarely be verified (e.g., stationarity). High variability at different spatial scales and non-linear trends are typical of environmental and pollution data.

The accident that occurred at the Chernobyl nuclear power plant in Ukraine on April 26, 1986 was one of the most serious catastrophes in nuclear power. The radioactive materials released from the destroyed reactor were transported over large distances and contaminated vast territories in Europe. From many points of view the Chernobyl case study is unique, with many lessons to be learned. The accident had environmental, radiological, social, political, psychological and economic consequences in Russia, Belarus and Ukraine. The spatial patterns of the Chernobyl fallout are very complex and highly spotty. This is a result of complex meteorological, physical and chemical processes: atmospheric turbulence, wet and dry deposition, orography, etc.
The high variability of surface contamination by different radionuclides at different geographical scales (from metres in populated sites to hundreds of kilometres at regional and continental scales) complicates the analysis, processing and presentation of the data and the results. At present, the most important radionuclides from a radiological point of view are Cs137 and Sr90, which have



a half-life of about 30 years. A recent analysis of surface contamination by Chernobyl radionuclides based on geostatistical methodology is presented in (Kanevski et al. 1997).

An alternative to geostatistics for the analysis and modelling of spatial data is based on artificial neural networks (ANN), which offer two primary attractions: learning and representation. One of the principal advantages of neural networks is their ability to discover patterns in data that exhibit significant unpredictable non-linearity. However, the results of ANN can be difficult to interpret. Recently, artificial neural networks and geostatistical models have been combined in a mixed approach for the analysis and modelling of spatially distributed data: neural network residual kriging/co-kriging (NNRK/NNRCK) and neural network residual simulated annealing (NNRSA) models (Kanevski et al., 1996). It was shown that feedforward networks (multilayer perceptrons) can be used efficiently to model non-linear large-scale trends, after which the residuals can be analysed (interpolated/simulated) with geostatistical interpolators/simulators.

The multilayer perceptron belongs to the class of global universal approximators (Haykin, 1994). There are several local ANN universal approximators, e.g., radial basis function (RBF) neural networks and general regression neural networks (GRNN) (Specht 1991). The GRNN belongs to the well-known nonparametric kernel regression models (Hardle 1989, Fan and Gijbels 1997). Specht's version of the GRNN is based on the Nadaraya-Watson estimator with Gaussian kernels (Nadaraya 1964, Watson 1964). In fact, this estimator is a special case of the local polynomial models (Ruppert and Wand 1994; Fan and Gijbels 1997). A comprehensive study of local modelling approaches and their applications, with an exhaustive bibliography, can be found in (Fan and Gijbels, 1997).
It should be noted that the GRNN can estimate not only the conditional mean at unsampled locations but also higher moments. The conditional variance of the estimate and its standard deviation can be derived from the available training data set. An important by-product of the GRNN is Bayesian posterior probabilities. The GRNN model has a solid mathematical background to support confidence estimates. Training of the network is very fast and adding new data is almost free: learning is one-shot, not incremental.

The GRNN model has some disadvantages that are common to all nonparametric kernel methods (Hardle 1989, Fan and Gijbels 1997). In the nonparametric setting all approximations F are biased estimators of the unknown function G, because there does not typically exist a function F* with a finite number of parameters able to approximate G. The GRNN architecture can be applied to continuous smooth function mapping. It requires a thoroughly prepared, representative data set. In real environmental and pollution mapping (2D or 3D) the curse of dimensionality is not a problem.

The main objectives of the present research are the use of the GRNN for spatial predictions/estimations of radioactively contaminated territories and the analysis of the results and residuals using standard statistical tools and variography. In addition to the standard GRNN output, a map of the conditional variance of the estimate is presented. Probabilistic mapping (maps of the probability of exceeding an intervention/countermeasure contamination level) can be performed by taking into account the theorem on asymptotic normality (Hardle 1989, Fan and Gijbels 1997). Another way to obtain probability maps is to estimate indicators with the GRNN instead of indicator kriging (Ali and Lall 1996). Extensions of the nonparametric regression approach that give it kriging-like features were carried out



in (Yakowitz and Szidarovszky 1985). In particular, a data-driven estimator of the expected square error was derived.

GENERAL REGRESSION NEURAL NETWORKS

The theory of general regression neural networks is founded on multivariate kernel density estimation and multivariate kernel regression (Nadaraya 1964; Watson 1964; Hardle 1989; Hardle and Muller 1997; Fan and Gijbels 1997; Ruppert and Wand 1994). The goal of multivariate nonparametric estimation (smoothing) is to approximate the probability density function $F(z_1^*, \ldots, z_m^*)$ of the m random variables $z = (z_1, \ldots, z_m)^T$ by using n measurements of each variable. The multivariate kernel density estimator in the m-dimensional case is defined as:

$$F(z^*) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_1 \cdots h_m} K\left(\frac{z_{i1} - z_1^*}{h_1}, \ldots, \frac{z_{im} - z_m^*}{h_m}\right) \tag{1}$$

where K is a multivariate kernel function and $h = (h_1, \ldots, h_m)^T$ is the vector of bandwidths (smoothing parameters).
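As a minimal sketch of estimator (1), the following assumes a product of one-dimensional Gaussian densities for K (the text leaves the kernel generic at this point). The function names `product_gaussian_kernel` and `kde` are illustrative and do not come from the paper.

```python
import math

def product_gaussian_kernel(u):
    """Product of standard 1-D Gaussian densities; an assumed choice of K."""
    return math.prod(math.exp(-0.5 * uj * uj) / math.sqrt(2.0 * math.pi) for uj in u)

def kde(z_star, samples, h):
    """Evaluate the kernel density estimator (1) at the point z_star.

    samples: list of n points, each a sequence of m coordinates
    h:       sequence of m bandwidths (h_1, ..., h_m)
    """
    n = len(samples)
    norm = math.prod(h)  # h_1 * ... * h_m
    total = 0.0
    for z_i in samples:
        # Scaled differences (z_ij - z_j*) / h_j for each coordinate j.
        u = [(zij - zsj) / hj for zij, zsj, hj in zip(z_i, z_star, h)]
        total += product_gaussian_kernel(u) / norm
    return total / n
```

With a single sample located at the evaluation point and unit bandwidths in two dimensions, the estimate reduces to the peak of a standard bivariate Gaussian, 1/(2π).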

Several basic kernel functions are frequently used in kernel density estimation and regression (Fan and Gijbels 1997). The requirements on a kernel function are straightforward. Kernels are usually symmetric functions that decrease smoothly from a maximum value at zero. Discontinuities in the kernel function lead to discontinuities in the predictions. The smoother the kernel function, the smoother the estimated function.

In general, the basic problems of spatial data prediction are the following. Relying on the original data set Z(xi,yi), i=1,...,n (measurements of soil contamination): 1) develop a spatial predictor and estimate pollution concentrations at unsampled points Z(x,y); 2) give an indication of the quality of this prediction (e.g., the variance of the estimates); 3) derive a probabilistic map, i.e. a map of the probability of exceeding some definite (intervention/countermeasure) contamination level Zint.

In the present study the original database was split into a training data set used for model development and a validation data set used to estimate the model's quality. Assume that the training data come from a sampling process that measures output values with additive random noise:

$$Z_i = E[Z \mid x_i, y_i] + \varepsilon_i \tag{2}$$

where the $\varepsilon_i$ are independent, have zero mean, and have variance $\sigma^2(x, y)$.


The conditional mean of Z given (x,y), also known as the regression of Z on (x,y), is the solution that minimises the mean squared error. If f(x,y,z) is the joint continuous probability density function, the conditional mean can be expressed as:

$$E[Z \mid x, y] = \frac{\int_{-\infty}^{\infty} z\, f(x, y, z)\, dz}{\int_{-\infty}^{\infty} f(x, y, z)\, dz} \tag{3}$$

If the error is normally distributed and homoskedastic, $\varepsilon_i \sim N(0, \sigma^2)$, the regression estimate becomes the best linear unbiased estimate in the maximum likelihood sense.

Let us consider the basic GRNN formulas used in the study. This part of the work is based on (Specht, 1991). The density function f(x,y,z) can be estimated from the data by using the consistent nonparametric estimators proposed by Parzen and extended to the multidimensional case by Cacoullos (Silverman 1986, Specht 1991):

$$f(x, y, z) = \frac{1}{(2\pi)^{3/2} h^3 n} \sum_{i=1}^{n} \exp\left(-\frac{D_i^2}{2h^2}\right) \exp\left[-\frac{(z - Z_i)^2}{2h^2}\right] \tag{4}$$

where a Gaussian kernel is used, n is the number of measurements in the training data set, h is the bandwidth, and the distance metric is:

$$D_i^2 = (x - x_i)^2 + (y - y_i)^2 \tag{5}$$

Substituting the joint probability estimate (4) into the conditional mean (3) gives the desired conditional mean of Z given (x,y), also called the Nadaraya-Watson kernel estimator (Hardle, 1989):

$$Z_m(x, y) = \frac{\sum_{i=1}^{n} Z_i \exp(-D_i^2 / 2h^2)}{\sum_{i=1}^{n} \exp(-D_i^2 / 2h^2)} = \sum_{i=1}^{n} \omega_i(x, y)\, Z_i \tag{6}$$

where the weights are:

$$\omega_i(x, y) = \frac{\exp(-D_i^2 / 2h^2)}{\sum_{j=1}^{n} \exp(-D_j^2 / 2h^2)}; \qquad \sum_{i=1}^{n} \omega_i(x, y) = 1 \tag{7}$$
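The substitution step can be written out explicitly. The sketch below uses only the two standard Gaussian integrals and the fact that the prefactor of the density estimate (4) is common to the numerator and denominator of (3) and therefore cancels; it is a worked restatement, not an additional result.

```latex
\begin{aligned}
\int_{-\infty}^{\infty} \exp\!\left[-\frac{(z - Z_i)^2}{2h^2}\right] dz &= \sqrt{2\pi}\, h, \\
\int_{-\infty}^{\infty} z \exp\!\left[-\frac{(z - Z_i)^2}{2h^2}\right] dz &= Z_i \sqrt{2\pi}\, h, \\
E[Z \mid x, y]
  &= \frac{\sum_{i=1}^{n} Z_i \sqrt{2\pi}\, h \, \exp(-D_i^2 / 2h^2)}
          {\sum_{i=1}^{n} \sqrt{2\pi}\, h \, \exp(-D_i^2 / 2h^2)}
   = \frac{\sum_{i=1}^{n} Z_i \exp(-D_i^2 / 2h^2)}
          {\sum_{i=1}^{n} \exp(-D_i^2 / 2h^2)} .
\end{aligned}
```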


By using the distribution function, higher moments can also be estimated as:

$$Z_m^{(k)}(x, y) = \frac{\sum_{i=1}^{n} (Z_i)^k \exp(-D_i^2 / 2h^2)}{\sum_{i=1}^{n} \exp(-D_i^2 / 2h^2)} \tag{8}$$
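Equation (8) translates almost directly into code. The following is a minimal sketch assuming an isotropic Gaussian kernel and training data stored as (x_i, y_i, Z_i) triples; the function name `grnn_moment` and the data layout are illustrative choices, not part of the paper.

```python
import math

def grnn_moment(x, y, data, h, k=1):
    """k-th conditional moment at (x, y), following equation (8).

    data: list of (x_i, y_i, Z_i) training triples
    h:    isotropic bandwidth
    """
    num = 0.0
    den = 0.0
    for xi, yi, zi in data:
        d2 = (x - xi) ** 2 + (y - yi) ** 2   # squared distance, equation (5)
        w = math.exp(-d2 / (2.0 * h * h))    # Gaussian kernel weight
        num += (zi ** k) * w
        den += w
    # Note: den can underflow to zero far from all data when h is very small.
    return num / den
```

With k=1 this is the conditional mean (6). One possible use of k=2 is the identity Var = E[Z²] - (E[Z])², e.g. `grnn_moment(x, y, data, h, 2) - grnn_moment(x, y, data, h, 1) ** 2`, which describes the local spread of the data rather than the variance of the estimate.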

The conditional variance of the estimate is (Yakowitz and Szidarovszky 1985; Fan and Gijbels 1997):

$$\mathrm{Var}\, Z_m(x, y) = \sum_{i=1}^{n} \sigma^2(x, y)\, \omega_i^2(x_i, y_i) \tag{9}$$

where the residual-based estimator of $\sigma^2(x, y)$ is:

$$\hat{\sigma}^2(x, y) = \frac{\sum_{i=1}^{n} (Z_i - Z_m)^2 \exp(-D_i^2 / 2h^2)}{\sum_{i=1}^{n} \exp(-D_i^2 / 2h^2)} \tag{10}$$
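One way to read equations (9) and (10) in code is sketched below. Several points are assumptions rather than statements from the paper: the residuals in (10) are taken as in-sample residuals at the training locations, the weights in (9) are taken as the normalised kernel weights (7) of each training point relative to the prediction location, and separate bandwidths `h_mean` and `h_var` are allowed for the mean and the variance. All function names are hypothetical.

```python
import math

def kernel_weights(x, y, data, h):
    """Normalised Gaussian weights of equation (7) at location (x, y)."""
    raw = [math.exp(-((x - xi) ** 2 + (y - yi) ** 2) / (2.0 * h * h))
           for xi, yi, _ in data]
    total = sum(raw)
    return [r / total for r in raw]

def grnn_mean(x, y, data, h):
    """Conditional mean of equation (6)."""
    w = kernel_weights(x, y, data, h)
    return sum(wi * zi for wi, (_, _, zi) in zip(w, data))

def grnn_variance(x, y, data, h_mean, h_var):
    """Variance of the estimate at (x, y), sketching equations (9) and (10)."""
    # Assumption: squared residuals are evaluated at the training locations.
    res2 = [(zi - grnn_mean(xi, yi, data, h_mean)) ** 2 for xi, yi, zi in data]
    # Equation (10): kernel-weighted average of the squared residuals.
    w_var = kernel_weights(x, y, data, h_var)
    sigma2 = sum(w * r for w, r in zip(w_var, res2))
    # Equation (9), read here as sigma^2(x, y) times the sum of squared weights.
    w_mean = kernel_weights(x, y, data, h_mean)
    return sigma2 * sum(w * w for w in w_mean)
```

Under this reading the variance vanishes when the data are constant, and it shrinks where many training points share the weight, since the sum of squared normalised weights is then small.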

Equation (9) can be modified to reflect both the additive noise in the sampling (measurement errors) at the new point (x,y) and the prediction error of the estimator (Atkeson et al. 1996). It should be noted that, in general, the bandwidth used for estimating the mean (regression) differs from the one used for estimating the conditional variance.

Relying on relation (4), a neural network implementation was developed by Specht (Specht, 1991). The neural network for conditional mean prediction is presented in Fig. 1. As usual, the input units are distribution units, which provide all of the scaled measurement variables (x,y) to all of the neurons in the second layer. Each pattern unit in the second layer is dedicated to one exemplar. The activation function used in the present study is the exponential, although other activation functions can be used. The pattern unit outputs are passed to the two summation units U and V, which perform a dot product between a weight vector and a vector composed of the signals from the pattern units:

$$U = \sum_{i=1}^{n} Z_i \exp(-D_i^2 / 2h^2) \tag{11}$$

$$V = \sum_{i=1}^{n} \exp(-D_i^2 / 2h^2) \tag{12}$$


The output unit merely divides U by V to yield the desired estimate of Z (in our case the Cs137 concentration). In the present work the same kind of network was developed for the prediction of higher moments (8) and variance (9).

The problem during network training on the training data set is to find the unknown smoothing parameter h. In the present study cross-validation was used, and the quality of training was studied by analysing the residuals with the help of univariate and spatial statistics. After training, the network should be tested and validated; it can then be used for generalisation, i.e. making predictions at unsampled points.

The smoothing parameter h influences the type of solution. When h is small (h → 0) the solution converges to interpolation (i.e., Zm → Zi if (x,y) → (xi,yi)). When h is large, smoothing is applied and the solution converges to an approximation. If h → ∞, Zm → ΣZi/n, the sample mean of the observations. By changing the smoothing parameter, the quantity and quality of the extracted spatial information (described by variograms as well) can be controlled.

Of course, various improvements of the GRNN beyond the classical vanilla version presented here are under development and study: generalisation of the distance metric, different sigma values in different directions (anisotropic window, hx ≠ hy), generalisation to multivariate data, a moving-window version with local adaptation of sigma, etc.
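The layer structure and the two limiting regimes of h can be illustrated with a short sketch. It assumes the isotropic Gaussian pattern units of equations (11) and (12); the function name `grnn_predict` and the three data points are made up for illustration and are not taken from the case study.

```python
import math

def grnn_predict(x, y, data, h):
    """GRNN output Z_m = U / V at (x, y) for an isotropic bandwidth h."""
    # Pattern layer: one Gaussian unit per training exemplar.
    pattern = [math.exp(-((x - xi) ** 2 + (y - yi) ** 2) / (2.0 * h * h))
               for xi, yi, _ in data]
    # Summation units, equations (11) and (12).
    u = sum(p * zi for p, (_, _, zi) in zip(pattern, data))
    v = sum(pattern)
    # Output unit: the ratio of the two summation units.
    return u / v

# Hypothetical (x, y, Z) training triples; their sample mean is 10.
data = [(0.0, 0.0, 2.0), (10.0, 0.0, 8.0), (0.0, 10.0, 20.0)]
small_h = grnn_predict(0.0, 0.0, data, h=0.5)   # close to the datum at (0, 0)
large_h = grnn_predict(0.0, 0.0, data, h=1e4)   # close to the sample mean
```

Because the network stores the exemplars themselves, adding a new measurement amounts to appending one triple to `data`; only h may need to be re-tuned.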

CASE STUDY

The case study is based on regional [approximately (120x80) sq. km], irregularly sampled data on soil contamination by the Cs137 radionuclide in the western part of the Briansk region. This is the most contaminated region in Russia due to the Chernobyl accident, and it has been studied using both traditional and geostatistical models. The problem that complicates the study and mapping is the high variability of the fallout at different scales. As usual when dealing with environmental data, non-linear large-scale trends in the region are important.

The original database has been split into two data sets: training and validation. The latter is used as a set of additional independent measurements in order to estimate the quality of the trained network. The basic information (batch statistics) on the training and validation data sets is presented in Table 1. The validation data set consists of 90 measurements selected from the region by a spatial declustering procedure in order to obtain a spatially representative validation data set. The region of investigation is covered by a regular grid (10x15). From each of the 150 cells, one sample (if there is any) is picked at random and added to the validation data set. Thus, the monitoring network of the validation data set is more homogeneous than that of the training (remaining) data set. Such a procedure gives a representative validation data set that covers the region of the study. Note that purely random selection would reproduce the spatial clustering of the original data. Of course, there are other ways to split the original data into validation and training sets. Moreover, the representativity and clustering of the data (spatial and dimensional resolutions of the monitoring networks) influence the final results independently of the spatial data modelling method used. It seems reasonable to use a spatial declustering procedure in order to obtain a representative validation data set from the original



non-homogeneous/clustered monitoring network. A detailed discussion of this interesting and important problem is outside the scope of this work. The monitoring networks (distribution of sampling points in space) are presented with the help of Delaunay triangulations in Fig. 2. Two isolated low-value points in the north-east part of the region belong to the validation data set and lie in the extrapolation region relative to the training data set.

Table 1. Basic statistical parameters. Measurement units of concentration in soil are Ci/km2 (1 Ci/km2 = 37 kBq/m2).

                  Training data   Validation data
No. of points     196             90
Min               1.03            2.1
Median            11.0            10.1
Maximum           39.8            34.97
Mean              13.3            12.1
Variance          61.1            62.7
Sigma             7.82            7.92
Skewness          1.05            0.92
Kurtosis          0.51            0.04
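The grid-based declustering split described in this section can be sketched as follows. The sketch assumes that the grid is laid over the bounding box of the data and that points are (x, y, value) triples; the text does not say which axis carries 10 cells and which 15, so `nx` and `ny` are left to the caller. The function name `declustered_split` is hypothetical.

```python
import random

def declustered_split(points, nx, ny, seed=0):
    """Split points into (training, validation) by one random draw per grid cell.

    points: list of (x, y, z) tuples
    nx, ny: number of grid cells along x and y
    """
    rng = random.Random(seed)
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x0, y0 = min(xs), min(ys)
    # Cell sizes; 'or 1.0' guards against a degenerate (zero-width) extent.
    dx = (max(xs) - x0) / nx or 1.0
    dy = (max(ys) - y0) / ny or 1.0
    cells = {}
    for idx, (x, y, _) in enumerate(points):
        cx = min(int((x - x0) / dx), nx - 1)   # clamp the upper edge into the last cell
        cy = min(int((y - y0) / dy), ny - 1)
        cells.setdefault((cx, cy), []).append(idx)
    # One randomly chosen sample from every non-empty cell goes to validation.
    chosen = {rng.choice(members) for members in cells.values()}
    validation = [p for i, p in enumerate(points) if i in chosen]
    training = [p for i, p in enumerate(points) if i not in chosen]
    return training, validation
```

Empty cells contribute nothing, which is consistent with a 150-cell grid yielding fewer than 150 validation points.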

The second step is spatial estimation. The only unknown parameter in the isotropic version of the GRNN presented above is the value of the bandwidth (h). There are several methods by which the bandwidth can be selected adaptively. In the anisotropic case a different value is used in each direction. In the more general case a Mahalanobis distance with a full covariance matrix can be used as the distance metric (Fan and Gijbels 1997). In the present study a global cross-validation (leave-one-out) technique was applied to select the optimal value of the smoothing parameter. Global cross-validation can be a particularly robust method for tuning parameters, because it does not make any special assumptions. Independent of the noise distribution, data distribution and underlying function, the cross-validation value is an unbiased estimate of how well a given set of parameters will perform on new data drawn from the same distribution as the training data. It should be noted that this robustness has led to the use of global cross-validation in applications that attempt to achieve high autonomy by making few assumptions, such as the General Memory Based Learning system (Atkeson et al. 1996).

The global cross-validation error curve is presented in Fig. 3. The error curve reaches its minimum at approximately 3 km. This value of the smoothing parameter is then applied for spatial estimation. Residuals of the GRNN estimator are analysed using univariate batch statistics and variography. The basic statistical parameters of the training data set residuals are presented in Table 2.
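Leave-one-out selection of the bandwidth can be sketched as below. The paper does not state which error measure defines the cross-validation curve, so the mean squared error used here is an assumption, as are the function names `loo_cv_error` and `select_bandwidth` and the idea of scanning a fixed list of candidate values.

```python
import math

def loo_cv_error(data, h):
    """Mean squared leave-one-out error of the GRNN for bandwidth h."""
    total = 0.0
    for j, (xj, yj, zj) in enumerate(data):
        num = den = 0.0
        for i, (xi, yi, zi) in enumerate(data):
            if i == j:
                continue                      # leave the j-th point out
            w = math.exp(-((xj - xi) ** 2 + (yj - yi) ** 2) / (2.0 * h * h))
            num += w * zi
            den += w
        if den == 0.0:
            return math.inf                   # h too small: all weights underflow
        total += (zj - num / den) ** 2
    return total / len(data)

def select_bandwidth(data, candidates):
    """Return the candidate bandwidth with the lowest leave-one-out error."""
    return min(candidates, key=lambda h: loo_cv_error(data, h))
```

Plotting `loo_cv_error` against the candidate values gives a curve of the kind shown in Fig. 3. Each evaluation costs on the order of n² kernel computations, which is modest for a few hundred points.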



Table 2. Batch statistics of the training data set residuals.

Minimum               -7.81
Median                 0.15
Maximum                5.28
Mean                   0.032
Variance               3.81
Standard deviation     1.95
Skewness              -0.85
Kurtosis               2.1

Experimental variography of the training data set, of the GRNN estimates of the training data set and of the corresponding residuals is presented in Fig. 4 as omnidirectional variograms. After training (selection of the unknown bandwidth by cross-validation) the variogram of the residuals shows a pure nugget effect, i.e. no spatial correlation (Fig. 4). In other words, all information related to the spatial correlation contained in the training data set has been extracted by the GRNN. The variogram of the estimates qualitatively follows the variogram of the training data set. The difference between the variograms of the training data and of the estimates corresponds to the variogram of the residuals. GRNN estimates at the points of the training data set versus the training data (accuracy test) are presented in Fig. 5. The correlation between estimates and measurements is rather high, 0.97 (batch statistics of the residuals are presented in Table 2).

As noted above, by varying the smoothing parameter it is possible to obtain predictions of different quality and to balance between bias and variance and between local and global estimation. By changing the bandwidth it is possible to smooth out small-scale variations and to model only large-scale nonlinear trends. The correlated residuals can then be analysed (estimated/simulated) with the help of geostatistical methods. Such an approach was used in the NNRK and NNRSA models (Kanevski et al., 1996).

The validity domain of the spatial estimates may be obtained by applying a threshold to the estimate of the marginal probability density function (pdf). Figure 6 presents the function V(x,y), equation (12). This function is proportional to the marginal pdf when the bandwidth is constant. In effect, it represents the distribution of sampling points in space, i.e. the design of the training data monitoring network. Outside the interpolation region this function is negligible.
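The residual check by variography can be reproduced with a basic omnidirectional experimental variogram. The sketch below uses the classical estimator (half the mean squared difference over all pairs in each distance bin); the binning scheme and the function name `omnidirectional_variogram` are assumptions, since the paper does not describe how its variograms were computed.

```python
import math

def omnidirectional_variogram(points, lag_width, n_lags):
    """Experimental semivariogram for (x, y, value) triples.

    Returns a list of (mean pair distance, semivariance, pair count),
    one entry per non-empty lag bin.
    """
    sums = [0.0] * n_lags
    dists = [0.0] * n_lags
    counts = [0] * n_lags
    for a in range(len(points)):
        xa, ya, va = points[a]
        for b in range(a + 1, len(points)):
            xb, yb, vb = points[b]
            d = math.hypot(xa - xb, ya - yb)   # direction is ignored
            k = int(d / lag_width)             # index of the lag bin
            if k < n_lags:
                sums[k] += 0.5 * (va - vb) ** 2
                dists[k] += d
                counts[k] += 1
    return [(dists[k] / counts[k], sums[k] / counts[k], counts[k])
            for k in range(n_lags) if counts[k] > 0]
```

Applied to residuals, a semivariance that stays roughly flat across lags corresponds to the pure nugget effect described above, while a rising curve would indicate spatial structure left in the residuals.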
The result of the GRNN spatial predictions obtained at the best bandwidth value (the lowest cross-validation error) is presented in Fig. 7. The corresponding map of the estimated conditional standard deviation (the square root of the conditional variance) is presented in Fig. 8. This kind of spatial information can be used to estimate the uncertainties of the predictions and to derive confidence intervals. In the asymptotic limit, the theorem of asymptotic normality makes it possible to derive risk maps, i.e. probability maps of exceeding definite contamination levels. Thus, the GRNN spatial predictions quantitatively reproduce regional and small-scale variations and major spots.

The results of the validation test (predictions for the validation data set, never used for training) for the GRNN model are presented in Fig. 9. Batch statistics of the validation residuals (statistics of the generalisation error) are given in Table 3. Structural analysis (variography) of the validation residuals shows a pure nugget effect (no spatial correlation), but with a higher value than for the training data set residuals.
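The risk-mapping step mentioned above can be sketched under a normal approximation: given a predicted mean and a variance at a location, the probability of exceeding a level Zint is one minus the normal cumulative distribution evaluated at Zint. Which variance to plug in is a modelling choice the paper does not spell out, and the function name `exceedance_probability` is hypothetical.

```python
import math

def exceedance_probability(z_mean, z_var, z_int):
    """P(Z > z_int) under a normal approximation with the given mean and variance."""
    if z_var <= 0.0:
        # Degenerate case: no spread, so the answer is a step function.
        return 1.0 if z_mean > z_int else 0.0
    t = (z_int - z_mean) / math.sqrt(z_var)
    # 1 - Phi(t), written with the complementary error function.
    return 0.5 * math.erfc(t / math.sqrt(2.0))
```

Evaluating this function on every node of the prediction grid, using the mean and variance maps, would give a probability-of-exceedance map for the chosen level.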



Table 3. Batch statistics of the validation data set residuals.

Minimum               -12.9
Median                  0.71
Maximum                13.8
Mean                    0.54
Variance               14.4
Standard deviation      3.8
Skewness               -0.34
Kurtosis                2.68

CONCLUSIONS

The work presents the application of general regression neural networks to the mapping of radioactively contaminated territories. General regression neural networks are fast, simple and clear models. They are based on a well-elaborated mathematical background, multivariate kernel regression methods, which have a long and successful history in statistics. The results obtained in this work on spatial predictions of soil contamination using general regression neural networks are promising. All information described by univariate statistics and variograms has been extracted from the data by the GRNN. The residuals demonstrate a pure nugget effect which quantitatively equals the nugget effect in the training data. It is important that higher moments and the variance of the predictions can also be estimated. By varying the smoothing parameters it is possible to train the GRNN model to extract large-scale structures (nonlinear trends). Kernel regression multivariate density estimators can be used for the spatial prediction of the conditional pdf and the corresponding risk mapping (probability of exceeding predefined intervention/countermeasure contamination levels).

ACKNOWLEDGEMENTS

The work was supported in part by INTAS grant 96-1957. The author thanks Prof. M. Maignan and Prof. J. Fan for their valuable comments.

REFERENCES

Ali A.I. and Lall U. (1996) A Kernel Estimator for Stochastic Subsurface Characterization. Ground Water. Vol. 34, No. 4, pp. 647-658.

Atkeson C.G., Moore A.W. and Schaal S. (1996) Locally Weighted Learning. http://www.cc.gatech.edu/fac/Chris.Atkeson.

Canu S., Soltani S., et al. (1996) Neural Networks and Other Flexible Regression Estimators for Spatial Interpolation. AIHENP International conference, Lausanne.

Cressie N. (1991) Statistics for Spatial Data. New York: Academic Press.

Deutsch C.V. and Journel A.G. (1998) GSLIB. Geostatistical Software Library and User's Guide. N.Y., Oxford University Press.

Fan J. and Yao Q. (1997) Efficient Estimation of Conditional Variance Functions in Stochastic Regression. http://

Fan J. and Gijbels I. (1997) Local Polynomial Modelling and Its Applications. Monographs on Statistics and Applied Probability 66. London, Chapman and Hall.



Goovaerts P. (1997) Geostatistics for Natural Resources Evaluation. N.Y., Oxford University Press.

Hardle W. (1989) Applied Nonparametric Regression. Cambridge: Cambridge University Press.

Hardle W. and Muller M. (1997) Multivariate and Semiparametric Kernel Regression. In: M.G. Schimek (Ed.) Smoothing and Regression. Approaches and Applications.

Hastie T. and Loader C. (1993) Local Regression: Automatic Kernel Carpentry. Statistical Science, Vol. 8, No. 2, pp. 120-143.

Haykin S. (1994) Neural Networks. A Comprehensive Foundation. New York: Macmillan College Publishing Company.

Kanevsky M., Arutyunyan R., et al. (1996) Artificial neural networks and spatial estimations of Chernobyl fallout. Geoinformatics. 7, 1-2, pp. 5-11.

Kanevsky M., Arutyunyan R., Bolshov L., Chernov S., Demyanov V., Koptelova N., Linge I., Savelieva E., Haas T., Maignan M. (1997) Chernobyl Fallout: Review of Advanced Spatial Data Analysis. In: «geoENV I - Geostatistics for Environmental Applications», A. Soares, J. Gomez-Hernandez, R. Froidevaux (Eds.) Kluwer Academic Publishers, pp. 389-400.

Masters T. (1995) Advanced Algorithms for Neural Networks. New York: John Wiley & Sons.

Nadaraya E.A. (1964) On Estimating Regression. Theory Probab. Appl. Vol. 9, pp. 141-142.

Ruppert D. and Wand M.P. (1994) Multivariate Locally Weighted Least Squares Regression. The Annals of Statistics. Vol. 22, No. 3, pp. 1346-1370.

Silverman B.W. (1986) Density Estimation for Statistics and Data Analysis. Vol. 26 of Monographs on Statistics and Applied Probability. Chapman and Hall, London.

Specht D. (1991) A General Regression Neural Network. IEEE Trans. on Neural Networks, 2, 6, pp. 568-576.

Watson G.S. (1964) Smooth Regression Analysis. Sankhya Ser. A, Vol. 26, pp. 359-372.

Yakowitz S.J. and Szidarovszky F. (1985) A Comparison of Kriging with Nonparametric Regression Methods. Journal of Multivariate Analysis. Vol. 16, pp. 21-53.



Figure captions of the paper «Spatial Predictions of Soil Contamination Using General Regression Neural Networks» by M. Kanevski

Figure 1. General Regression Neural Networks.

Figure 2. Training (left) and validation (right) data sets monitoring networks.

Figure 3. Cross-validation error curve.

Figure 4. Variograms of the training data set, GRNN estimates and residuals.

Figure 5. GRNN estimates of the training data set versus measurements. Accuracy test.

Figure 6. Validity domain.

Figure 7. GRNN spatial estimates of soil contamination.

Figure 8. Conditional standard deviation of the GRNN estimates (square root of conditional variance).

Figure 9. Validation test.

