Statistical Evaluation of Spatial Interpolation Methods for Small-Sampled Region: A Case Study of Temperature Change Phenomenon in Bangladesh

Avit Kumar Bhowmik and Pedro Cabral

Instituto Superior de Estatística e Gestão de Informação, ISEGI, Universidade Nova de Lisboa, 1070-312 LISBOA, Portugal
{m2010161,pcabral}@isegi.unl.pt
Abstract. This study compares three interpolation methods for creating continuous surfaces that describe temperature trends in Bangladesh between 1948 and 2007. The techniques reviewed are Spline, Inverse Distance Weighting (IDW) and Kriging. A statistical assessment based on univariate statistics of the resulting continuous surfaces indicates that there is little difference in the predictive power of these techniques, which makes it hard to select the best interpolation method. A Willmott statistical evaluation has been applied to reduce this uncertainty. Results show that IDW performs better for average and minimum temperature trends and Ordinary Kriging for maximum temperature trends. Results further indicate that temperature has an increasing trend all over Bangladesh, noticeably in the northern and coastal southern parts of the country. The temperature follows an overall increasing trend of 1.06 °C per 100 years.

Keywords: Spatial Interpolation, Spline, Inverse Distance Weighting, Ordinary Kriging, Univariate Statistics, Willmott Statistics.
B. Murgante et al. (Eds.): ICCSA 2011, Part I, LNCS 6782, pp. 44–59, 2011. © Springer-Verlag Berlin Heidelberg 2011

1 Introduction

The temperature change over the past 30–50 years is unlikely to be entirely due to internal climate variability and has been attributed to changes in the concentrations of greenhouse gases and sulphate aerosols due to human activity [1]. Although the global distribution of climate response to many global climate catalysts is reasonably congruent in climate models, suggesting that the global metric is surprisingly useful, climate effects are felt locally and are region-specific [2]. Spatial interpolation methods have been used to quantify region-specific changes of temperature based on historical data [3]. There is no single preferred method for data interpolation; the selection criteria are the data, the required level of accuracy and the time and/or computer resources available. Geostatistics is based on the theory of regionalized variables [4, 5 & 6] and makes it possible to capitalize on the spatial correlation between neighboring observations to predict attribute values at unsampled locations. Geostatistical spatial interpolation techniques such as Spline, IDW,
Kriging, etc. provide better estimates of temperature than conventional methods [7 & 8]. Results strongly depend on the sampling density and, for high-resolution networks, the kriging method does not show significantly greater predictive power than simpler techniques, such as the inverse square distance method [9].

This study compares three spatial interpolators (Spline, IDW and Kriging) with the goal of determining which one creates the best representation of reality for temperatures measured between 1948 and 2007 in Bangladesh. Specifically, this study aims to describe the temperature change phenomenon in the region by: (1) describing the overall and station-specific Average, Maximum and Minimum temperature using trend analysis of the historical dataset; (2) interpolating the trend values obtained from trend analysis; and (3) evaluating the interpolation results using Univariate and Willmott Statistical methods, thus identifying the most appropriate interpolation method. Additionally, the benefits and limitations of these commonly used interpolation methods for small-sampled areas are discussed.

This assessment is important because much of geographic research includes the creation of data for spatial analysis. Selecting an appropriate spatial interpolation method is key to surface analysis, since different methods of interpolation result in different surfaces.
2 Study Area

Bangladesh is one of the countries most likely to suffer adverse impacts from anthropogenic climate change [2]. Threats include sea level rise (approximately one fifth of the country consists of low-lying coastal zones within 1 meter of the high water mark), droughts, floods, and seasonal shifts. The total area of the country is 147,570 km², and the Bangladesh Meteorological Department operates only thirty-four meteorological stations to measure rainfall and temperature across it [11] (Fig. 1). A number of studies of trends in climatic parameters over Bangladesh have pointed out that the mean annual temperature increased during the period 1895-1980, at 0.31 °C over the past two decades, and that the increase in annual mean maximum temperature will reach 0.4 °C and 0.73 °C by the years 2050 and 2100 respectively [1, 2, 12, 13 & 14]. In this context, it is essential to quantify region-specific changes of temperature in Bangladesh in recent years based on historical data.
3 Data and Methods

3.1 Data

Daily temperature data from 1948 to 2007, collected at 34 fixed meteorological stations of the Bangladesh Meteorological Department, were used in this study. Average, maximum and minimum daily, monthly and yearly temperatures were derived from this dataset and used for trend analysis. Microsoft Excel 2007 was used for trend analysis, and ArcGIS 9.3.1 and GeoMS [15] were used for the spatial interpolation of the trend values.
Fig. 1. Study area: Bangladesh, with the locations of the thirty-four meteorological stations
3.2 Trend Analysis

Trend analysis is the most commonly used process to describe the temperature change phenomenon of a region using historical data [1]. It uses the simplest form of regression, linear regression, which fits the formula of a straight line (1).

\[ y = a + bx \tag{1} \]
The regression determines the values of a and b that best predict the value of y for a given value of x. Linear regression assumes that an intercept term is to be included and takes two parameters: the independent variables (a matrix whose columns represent the independent variables) and the dependent variable (a column vector). Trend analysis by linear regression is less affected by large errors than least squares regression [16]. For the region-specific analysis, trend values of average, maximum and minimum temperature change have been calculated for every station using formula (2).

\[ b = \frac{\sum_{i} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i} (x_i - \bar{x})^2} \tag{2} \]

where $x_i$ is the independent variable, $\bar{x}$ is the average of the independent variable, $y_i$ is the dependent variable and $\bar{y}$ is the average of the dependent variable. If the value of b is positive, the dataset shows an increasing trend. If it is negative, the dataset
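shows a decreasing trend. The higher the value of b, the stronger the trend of change. One way of testing the significance of temperature trends is to calculate the coefficient of determination, $R^2$, of the trend (3). Values of $R^2$ vary between 0 and 1.

\[ R^2 = \frac{\left[ \sum_{i} (x_i - \bar{x})(y_i - \bar{y}) \right]^2}{\sum_{i} (x_i - \bar{x})^2 \, \sum_{i} (y_i - \bar{y})^2} \tag{3} \]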
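The trend slope b and the coefficient of determination described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study itself used Microsoft Excel 2007, the function name `linear_trend` is hypothetical, and the yearly temperatures below are made-up values, not station data. The sketch uses the ordinary least-squares slope and the squared correlation form of R², which coincides with the usual definition for a straight-line fit.

```python
from statistics import fmean

def linear_trend(x, y):
    """Return (a, b, r2) for the least-squares line y = a + b*x.

    Assumes x and y have equal length and that neither is constant
    (a constant series would make a denominator zero).
    """
    x_bar, y_bar = fmean(x), fmean(y)
    # Sums of cross-products and squares about the means
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    b = sxy / sxx                    # trend (slope) per unit of x
    a = y_bar - b * x_bar            # intercept
    r2 = (sxy * sxy) / (sxx * syy)   # coefficient of determination
    return a, b, r2

# Hypothetical yearly mean temperatures in degrees Celsius (illustrative only)
years = [2000, 2001, 2002, 2003, 2004]
temps = [25.0, 25.1, 25.3, 25.2, 25.4]

a, b, r2 = linear_trend(years, temps)
trend_per_century = b * 100          # rescale a per-year slope to per 100 years
print(f"b = {b:.3f} per year, R2 = {r2:.2f}")
```

Multiplying the per-year slope by 100 gives a trend in the "per 100 years" unit used in the abstract; a positive b indicates warming and a negative b cooling.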
The highest correlation corresponds to a value of 1, and correlation gradually weakens towards zero. Values below 0.5 have been considered to indicate a less significant correlation.

3.3 Spatial Interpolation

The idea and mechanism of spatial interpolation for this study were taken from [3]. Interpolation is a method or mathematical function that estimates the values at locations where no measured values are available. It can be as simple as a number line; however, most geographic information science research involves spatial data. Spatial interpolation assumes that attribute data are continuous over space. This allows the attribute to be estimated at any location within the data boundary. Another assumption is that the attribute is spatially dependent, meaning that values closer together are more likely to be similar than values farther apart. These assumptions allow the spatial interpolation methods to be formulated [3].

Spatial interpolation is widely used for creating continuous data from data collected at discrete locations, i.e. points. These point data are displayed as interpolated surfaces for qualitative interpretation. In addition to qualitative research, these interpolated surfaces can also be used in quantitative research, from climate change to anthropological studies of human locational responses to landscape [3]. However, when an interpolated surface is used as part of a larger research project [8], both the method and the accuracy of the interpolation technique are important.

The goal of spatial interpolation is to create a surface that best represents empirical reality; thus, the method selected must be assessed for accuracy. The techniques assessed in this study include the deterministic interpolation methods of Spline [8] and IDW [8] and the stochastic method of Kriging [8], in an effort to retain the actual temperature trend measurements in the final surface.
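The two assumptions above, continuity and spatial dependence, can be illustrated with a minimal inverse distance weighting sketch. It is not the ArcGIS implementation used in the study: the function name `idw`, the power of 2 (the "inverse square distance" variant) and the station coordinates and values are all assumptions chosen for illustration.

```python
import math

def idw(stations, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples.

    Each sample is weighted by 1 / distance**power, so nearby samples
    dominate the estimate (spatial dependence).
    """
    num = 0.0
    den = 0.0
    for sx, sy, value in stations:
        d = math.hypot(x - sx, y - sy)
        if d == 0.0:
            # Query point coincides with a sample: return its value exactly
            return value
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den

# Hypothetical trend values at three made-up station locations
stations = [(0.0, 0.0, 1.0), (10.0, 0.0, 2.0), (0.0, 10.0, 3.0)]

at_station = idw(stations, 0.0, 0.0)   # exactly on the first station
near_first = idw(stations, 1.0, 0.0)   # close to the first station
midpoint = idw(stations, 5.0, 0.0)     # halfway between the first two
print(at_station, round(near_first, 3), round(midpoint, 3))
```

An estimate can be requested at any location, it reproduces the measured value at a station, and it moves toward the value of whichever station is nearest.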
Each selected method requires that the exact trend values for the sample points are included in the final output surface. The Spline method can be thought of as fitting a rubber-sheeted surface through the known points using a mathematical function. In ArcGIS, the spline interpolation is a Radial Basis Function (RBF). The equation for a k-order B-spline with n+1 control points $(P_0, P_1, \ldots, P_n)$ is (4):

\[ P(t) = \sum_{i=0}^{n} N_{i,k}(t)\, P_i , \]
tk-1