INTERNATIONAL JOURNAL OF CLIMATOLOGY Int. J. Climatol. 30: 620–631 (2010) Published online 9 April 2009 in Wiley InterScience (www.interscience.wiley.com) DOI: 10.1002/joc.1913
Comparison of different geostatistical approaches to map climate variables: application to precipitation
Francisco J. Moral*
Department of Graphic Representation, University of Extremadura, 06071 Badajoz, Spain
ABSTRACT: The benefits of integrating a geographical information system (GIS) with a geostatistical approach to accurately model the spatial distribution pattern of precipitation are known. However, the determination of the most appropriate geostatistical algorithm for each case is usually neglected, even though selecting the best interpolation technique for each study area is important for obtaining accurate results. In this work, the ordinary kriging (OK), simple kriging (SK) and universal kriging (UK) methods are compared with three multivariate algorithms which take into account the altitude: collocated ordinary cokriging (OCK), simple kriging with varying local means (SKV) and regression-kriging (RK). The different techniques are applied to monthly and annual precipitation data measured at 136 meteorological stations in a region of southwestern Spain (Extremadura). After carrying out cross-validation, the smallest prediction errors are obtained for the three multivariate algorithms; in particular, SKV and RK outperform collocated OCK, which needs a more demanding variogram analysis. These algorithms are easily implemented in a GIS, requiring only the residual estimates and map algebra capability to generate the final maps. The results show the need to account for spatially dependent precipitation data and the collocated altitude in order to define monthly and annual precipitation maps accurately. Copyright 2009 Royal Meteorological Society
KEY WORDS kriging; precipitation; altitude; geographical information system; regression
Received 11 July 2008; Revised 22 January 2009; Accepted 8 March 2009
1. Introduction
Many areas of research (e.g. climatology, agriculture, ecological modelling, hydrology) require interpolated surfaces or gridded datasets of climate variables. Consequently, numerous attempts have been made at spatial interpolation using a variety of methods. Surfaces of climate variables have been interpolated from point data for areas ranging from a few thousand square kilometres (Ninyerola et al., 2000; Vicente-Serrano et al., 2003) to the continental scale (Hulme et al., 1995, 1996) and even the entire globe (Willmott and Robeson, 1995). The main problem, prior to the selection of the most appropriate estimation technique, is the availability of climatic data. Data are sometimes recorded at permanent but too widely dispersed weather stations, especially in mountainous areas, where climatic values are more difficult to predict because of the complex topography. Even in flatter areas, weather stations should be properly distributed to detect the influences of air flows, surrounding mountains, thermal inversions and other phenomena that could affect the climatic patterns.
* Correspondence to: Francisco J. Moral, Department of Graphic Representation, University of Extremadura, 06071 Badajoz, Spain. E-mail: [email protected]
Spatial interpolation methods differ in their assumptions, in their deterministic or statistical nature, and in their local or global perspective (Burrough and McDonnell, 1998): local methods use only the data of the nearest sampling points to estimate at unsampled locations, whereas global methods use the data of all sampling points. Examples of deterministic techniques include inverse distance weighting (Legates and Willmott, 1990), splines (Hutchinson and Gessler, 1994) and empirical multiple regression (Agnew and Palutikof, 2000; Vicente-Serrano et al., 2003). However, it is recognized that the statistical approach, i.e. geostatistical methods or kriging, has several advantages over the deterministic techniques (Isaaks and Srivastava, 1989; Goovaerts, 1997). Nowadays, geostatistics is widely used in climate mapping (Atkinson, 1997; Goovaerts, 1997). An important advantage of kriging is that it gives unbiased predictions with minimum variance and takes into account the spatial correlation between the data recorded at different weather stations. Some studies have shown that kriging provides better estimates than other techniques (e.g. Phillips et al., 1992; Goovaerts, 2000), but other authors have found that results depend on the sampling density (Dirks et al., 1998). A major advantage of kriging over simpler methods, besides providing a measure of prediction error (the kriging variance), is the possibility of complementing the sample data, when they are sparse, with secondary or auxiliary information which can help with interpolation.
Those sources of knowledge are: (1) data from a cheap-to-measure covariable which is known at many more points, and (2) an empirical spatial model of a driving process. For precipitation, weather-radar observations can serve as the secondary data, as in the work of Azimi-Zonooz et al. (1989) and Raspa et al. (1997), who estimated precipitation fields using multivariate extensions of kriging (cokriging and kriging with an external drift, respectively). However, Goovaerts (2000) suggested the use of altitude from a digital elevation model (DEM) as another valuable and cheaper source of auxiliary data. It is known that precipitation tends to increase with elevation because of the orographic effect of mountainous areas, where the air is lifted vertically and condensation occurs as a result of adiabatic cooling. Goovaerts (2000) showed that geostatistical algorithms outperform deterministic techniques and that, in particular, the multivariate extensions of kriging which take altitude into account generate the best results. More recently, Diodato (2005) also found better estimates when ordinary cokriging (OCK), with altitude as the auxiliary data, was compared with ordinary kriging (OK). In recent years, some mixed interpolation techniques have been developed which combine kriging with the secondary information. According to Hengl et al. (2003), these methods can be classified depending on the properties of the input data. When the number of secondary variables is low and these auxiliary data are not available at all grid nodes, cokriging is the most appropriate interpolation technique. If auxiliary data are available at all grid nodes and are correlated with the primary or target variable, kriging with a trend model or external drift (Hudson and Wackernagel, 1994; Bourennane et al., 2000) is the appropriate interpolation method. This non-stationary geostatistical technique has three different approaches from a computational point of view.
In the first, called universal kriging (UK), the trend is modelled as a function of the coordinates (Deutsch and Journel, 1992; Wackernagel, 1998). If the trend is defined externally, by means of some secondary variables, the term kriging with external drift (or trend) is used (Wackernagel, 1998; Chiles and Delfiner, 1999). The third approach consists of regression modelling: the trend is modelled outside the kriging algorithm, and the residuals are then kriged. This was called regression-kriging (RK) by Odeh et al. (1994, 1995), while Goovaerts (1999) employed the term kriging after detrending. Another multivariate extension of kriging is the simple kriging (SK) with varying local means algorithm. It is similar to the kriging with external drift method, but has some advantages over it (Goovaerts, 1997). Besides the previously cited references, there are others on the use of different geostatistical techniques to interpolate precipitation data. Martínez-Cob (1996) obtained the best estimates using cokriging, including topography to improve predictions. Pardo-Igúzquiza (1998) found the best results for the prediction of precipitation by means of kriging with an external drift. However, according to Goovaerts (1999), RK has
proven to be superior to simpler geostatistical methods. Therefore, several methods must be compared to establish the best technique to estimate precipitation in a particular area or region. There is more agreement when deterministic and geostatistical algorithms are compared: the great majority of studies show better results when kriging, or any multivariate extension of it, is used. There are also some references on modelling climatological variables using geographical information systems (GIS). For example, Ninyerola et al. (2000) and Agnew and Palutikof (2000) integrated statistical and GIS techniques to produce climatic maps. The linkage of GIS, statistics and geostatistics provides a complementary set of tools for spatial analysis (Burrough, 2001). In this paper, monthly and annual precipitation data from the Extremadura region (Spain) are interpolated to generate high-resolution maps, using two types of geostatistical methods: (1) algorithms that use only precipitation data recorded at the meteorological stations (ordinary kriging, OK, and simple kriging, SK); (2) algorithms that combine precipitation data with auxiliary information (universal kriging, UK; ordinary cokriging, OCK; simple kriging with varying local means, SKV; and regression-kriging, RK). The prediction performances of the algorithms are compared using cross-validation, and the one with the highest accuracy of estimates is selected to map precipitation. Thirteen maps are the outcome of this work: 12 maps of mean monthly precipitation and 1 of mean annual precipitation. The reasons for the different performance of the approaches are also investigated.
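As an illustration of the regression-kriging idea described above (a trend modelled outside the kriging system, followed by kriging of the residuals) and of its assessment by leave-one-out cross-validation, a minimal Python sketch is given below. It is not the implementation used in this work: the station coordinates, altitudes and precipitation values are synthetic, the exponential covariance model and its range (corr_range) are assumed rather than fitted from an experimental variogram, and all function names are hypothetical.

```python
import numpy as np

def exp_cov(h, sill, corr_range):
    """Exponential covariance C(h) = sill * exp(-3h / range) (assumed model)."""
    return sill * np.exp(-3.0 * h / corr_range)

def simple_krige(xy, resid, xy0, sill, corr_range, nugget=1e-8):
    """Simple kriging of zero-mean residuals at a single location xy0."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=2)
    C = exp_cov(d, sill, corr_range) + nugget * np.eye(len(xy))
    c0 = exp_cov(np.linalg.norm(xy - xy0, axis=1), sill, corr_range)
    weights = np.linalg.solve(C, c0)
    return weights @ resid

def regression_kriging(xy, z, elev, xy0, elev0, corr_range):
    """Linear trend on altitude plus the kriged regression residual."""
    A = np.column_stack([np.ones_like(elev), elev])
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    resid = z - A @ beta
    sill = resid.var()  # sill taken as the residual variance (assumption)
    trend0 = beta[0] + beta[1] * elev0
    return trend0 + simple_krige(xy, resid, xy0, sill, corr_range)

def loo_rmse(xy, z, elev, corr_range):
    """Leave-one-out cross-validation RMSE; the trend is refitted in each fold."""
    errors = []
    for i in range(len(z)):
        keep = np.arange(len(z)) != i
        pred = regression_kriging(xy[keep], z[keep], elev[keep],
                                  xy[i], elev[i], corr_range)
        errors.append(pred - z[i])
    return float(np.sqrt(np.mean(np.square(errors))))

# Synthetic "stations": precipitation increases with altitude, plus noise.
gen = np.random.default_rng(0)
n = 60
xy = gen.uniform(0.0, 100.0, size=(n, 2))      # coordinates (km)
elev = gen.uniform(100.0, 2000.0, size=n)      # altitude (m a.s.l.)
z = 400.0 + 0.3 * elev + gen.normal(0.0, 20.0, size=n)  # precipitation (mm)

rk_rmse = loo_rmse(xy, z, elev, corr_range=30.0)

# Baseline for comparison: predict each station by the mean of the others.
baseline_errors = [z[np.arange(n) != i].mean() - z[i] for i in range(n)]
baseline_rmse = float(np.sqrt(np.mean(np.square(baseline_errors))))

print(f"RK leave-one-out RMSE: {rk_rmse:.1f} mm")
print(f"Mean-only baseline RMSE: {baseline_rmse:.1f} mm")
```

In this sketch the trend coefficients are re-estimated inside every cross-validation fold, so the left-out station never influences its own prediction. With real data, the covariance parameters would normally be obtained by fitting a model to the experimental variogram of the residuals instead of being assumed.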
2. Site description
This work is centred on Extremadura (latitude between 37°57′ and 40°29′ N, longitude between 4°39′ and 7°33′ W). The region is located in the southwest of Spain on the Portuguese border. It is one of the largest regions in Europe, with a surface area of approximately 41 600 km², the size of Belgium. Extremadura shows great contrasts, with wide agricultural and forest areas, and is considered to be one of the most important ecological enclaves in Europe. In the north lie districts with gentle wooded hills, the Sierra de Gata and Hurdes, which link with the high Gredos mountains through the fertile valleys of the Alagón, Jerte and La Vera. In the east lie the irrigated lands of the river Tagus, the rugged Villuercas, and the areas of Los Montes and La Serena, with the longest interior coast in the Iberian Peninsula, which descend further south to the agricultural areas of La Campiña. In the west, the great plains of Brozas and Alcántara drop to the San Pedro mountains and lead into the rich plain of the Guadiana river. In the south lie the great pasture lands and the mountains of Jerez, Tentudía and Hornachos. The maximum and minimum altitudes in the region are 2091 m a.s.l., in the Gredos mountains, and 116 m a.s.l., in the Guadiana valley (near the border between Spain and Portugal), respectively. The mean altitude is about 425 m
Figure 1. Location and topography of the study area.
a.s.l. Figure 1 shows the DEM, with a spatial resolution of 1 km, used in this research. The climate of Extremadura is characterized by a variation in both temperature and precipitation typical of a Mediterranean climate. However, this feature is modified by the interior location of the region and by oceanic influences that penetrate the peninsula due to its proximity to the Atlantic. Mean annual precipitation reaches