
DIGITAL ELEVATION MODEL QUALITY AND UNCERTAINTY IN DEM-BASED SPATIAL MODELLING

BRUCE HENDRY CARLISLE

A thesis submitted in partial fulfilment of the requirements of the University of Greenwich for the Degree of Doctor of Philosophy

September 2002

Declaration

I certify that this work has not been accepted in substance for any degree, and is not concurrently submitted for any degree other than that of Doctor of Philosophy (PhD) of the University of Greenwich. I also declare that this work is the result of my own investigations except where otherwise stated.

Signed_______________________________________________________(student)

Signed_____________________________________________________(supervisor)

Abstract

This thesis describes research undertaken to further understanding of the quality of Digital Elevation Models (DEMs), the nature of uncertainty in DEMs and how knowledge of uncertainty in DEMs can be incorporated in DEM-based spatial modelling applications. Despite increasing concern for understanding and working with the uncertainty within DEMs, knowledge about DEM quality is still at a primitive stage and incorporation of this knowledge into DEM-based modelling applications has only developed to a limited extent. The research presented here comprises three main areas.

First, a holistic approach to DEM quality assessment is developed. This holistic approach combines visual, geomorphometric and elevation accuracy quality assessment techniques. The quality of 26 DEMs, produced using varying data sampling patterns and interpolation methods, is assessed using these techniques. This allows an in-depth assessment of the causes and characteristics of DEM quality. It is shown that all three assessment techniques are required to give a comprehensive assessment of all aspects of DEM quality.

Second, this research addresses the limitations of using a single root mean squared error (RMSE) value to represent the uncertainty associated with a DEM by developing a new technique for creating a spatially distributed model of DEM quality – an accuracy surface. The technique is based on the hypothesis that the distribution and scale of elevation error within a DEM are at least partly related to morphometric characteristics of the terrain. The technique involves generating a set of terrain parameters to characterise terrain morphometry and developing regression models to define the relationship between DEM error and morphometric character. The regression models form the basis for creating standard deviation surfaces to represent DEM accuracy. The hypothesis is shown to be true and reliable accuracy surfaces are successfully created. These accuracy surfaces provide more detailed information about DEM accuracy than a single global estimate of RMSE.
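As a rough illustration of the difference between a single global RMSE and a spatially varying accuracy estimate, the sketch below computes an RMSE from a handful of check points and then fits a one-variable least squares line relating error magnitude to slope gradient. The check-point values, the choice of gradient as the only predictor and all function names are hypothetical. The thesis itself uses a larger set of terrain parameters and stepwise regression (Section 4.3.7.1), so this is a simplified stand-in rather than the procedure developed here.

```python
import math

def rmse(errors):
    """Global RMSE of elevation errors (DEM elevation minus reference elevation)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b

# Hypothetical check points: slope gradient (degrees) and elevation error (m).
gradient = [2.0, 5.0, 10.0, 20.0, 30.0, 40.0]
error = [0.5, -1.0, 1.5, -3.0, 4.0, -5.5]

# One number describing the whole DEM.
global_rmse = rmse(error)

# Assumption for illustration: error magnitude is a linear function of gradient.
a, b = fit_line(gradient, [abs(e) for e in error])

def predicted_error_magnitude(g):
    """Predicted error magnitude (m) for a cell with gradient g, floored at zero."""
    return max(0.0, a + b * g)

# A tiny hypothetical gradient grid turned into a spatially varying surface.
gradient_grid = [[3.0, 8.0], [25.0, 38.0]]
accuracy_surface = [[predicted_error_magnitude(g) for g in row]
                    for row in gradient_grid]
```

With these made-up numbers the fitted slope is positive, so steeper cells receive a larger predicted error than flatter ones, whereas the global RMSE assigns the same value everywhere.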

Third, incorporating knowledge of DEM uncertainty into modelling applications is considered. It is shown that using a DEM accuracy surface, rather than a single RMSE value, in Monte Carlo simulations gives a better representation of uncertainty in DEM-derived terrain parameters, such as slope gradient and aspect, upstream area and watershed delineation.
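The general shape of such a Monte Carlo comparison can be sketched as follows. Each realisation adds normally distributed noise to the DEM, a derived parameter (here slope gradient at the centre of a 3 x 3 window) is recalculated, and the spread of the results is summarised. The DEM values, the two standard deviation grids and the function names are hypothetical, and the noise is drawn independently for each cell, which ignores spatial autocorrelation. It is a minimal sketch under those assumptions, not the simulation tools described in Chapter 5.

```python
import math
import random
import statistics

def centre_gradient(dem, cell_size):
    """Slope gradient (degrees) at the centre of a 3x3 window, by central differences."""
    dz_dx = (dem[1][2] - dem[1][0]) / (2 * cell_size)
    dz_dy = (dem[2][1] - dem[0][1]) / (2 * cell_size)
    return math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))

def simulate(dem, sd_surface, cell_size, n, seed=0):
    """Return mean and standard deviation of centre gradient over n noisy realisations."""
    rng = random.Random(seed)
    gradients = []
    for _ in range(n):
        # Perturb every cell with zero-mean noise scaled by that cell's standard deviation.
        realisation = [[z + rng.gauss(0.0, sd) for z, sd in zip(z_row, sd_row)]
                       for z_row, sd_row in zip(dem, sd_surface)]
        gradients.append(centre_gradient(realisation, cell_size))
    return statistics.mean(gradients), statistics.stdev(gradients)

# Hypothetical 3x3 DEM (m) on a 10 m grid.
dem = [[100.0, 102.0, 104.0],
       [101.0, 103.0, 105.0],
       [102.0, 104.0, 106.0]]

# Global case: the same RMSE-derived standard deviation in every cell.
global_sd = [[3.0] * 3 for _ in range(3)]

# Local case: a hypothetical accuracy surface that varies from cell to cell.
local_sd = [[0.5, 0.5, 1.0],
            [0.5, 1.0, 1.5],
            [1.0, 1.5, 2.0]]

mean_global, sd_global = simulate(dem, global_sd, 10.0, 500)
mean_local, sd_local = simulate(dem, local_sd, 10.0, 500)
```

In this toy case the local standard deviations are all smaller than the global value, so the simulated gradient varies less under the accuracy surface. With real data the accuracy surface could equally indicate greater uncertainty than the global figure in some parts of the DEM.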

A key outcome of this research is a defined procedure for producing a comprehensive DEM quality report. An extension to the ArcView GIS package has been developed, which provides a set of tools for holistic DEM quality assessment, creating an accuracy surface and undertaking Monte Carlo simulation. The research presented in this thesis and the ArcView extension provide the knowledge and the means for incorporating consideration of DEM quality and associated uncertainty into DEM-based spatial modelling applications.

Acknowledgements

Thanks go to my supervisors, Dr. Andy Bussell, Dr. Gesche Schmid and Dr. Melanie Smith, for all their help and advice. I am also grateful to Dr. Ian Heywood for his supervision in the early years, for helping come up with the initial ideas, and for generally making things happen. Thanks to Steve Carver for organising the expedition to Greenland, to Barnie and John for helping with fieldwork in Greenland and to Graham Smith for helping with fieldwork in Snowdonia. And last but not least thank you to Kye and the kids (Zuni and Djembe) for tolerating the grumpy bits and allowing me the time and space.

Contents

LIST OF FIGURES
LIST OF TABLES

CHAPTER 1: INTRODUCTION
1.1 Digital Elevation Models
1.2 Uncertainty
1.2.1 Assessing DEM Quality
1.2.2 Modelling DEM Error
1.2.3 Applying Knowledge of DEM Uncertainty
1.3 The Need for Research into DEM Accuracy, Quality and Uncertainty
1.4 Research Scope
1.5 Research Aims
1.6 Thesis Outline

CHAPTER 2: DEMS, ACCURACY, QUALITY AND UNCERTAINTY
2.1 Digital Elevation Models
2.1.1 Definitions
2.1.2 The Benefits and Drawbacks of DEMs
2.1.3 Generating DEMs
2.1.3.1 Photogrammetric Techniques
2.1.3.2 Interpolating DEMs
2.1.3.3 LIDAR
2.1.4 DEM Applications
2.1.4.1 Surface Form
2.1.4.2 Surface Topology
2.1.4.3 DEM-based Environmental Indices
2.2 Quality, Accuracy, Error and Uncertainty
2.2.1 Quality
2.2.2 Accuracy and Error
2.2.3 Uncertainty
2.3 DEM Quality
2.3.1 Elevation Accuracy
2.3.2 Geomorphometric Characteristics
2.3.3 Limitations of the Model
2.4 Causes of Reduced DEM Quality
2.4.1 Quality of Data Sources
2.4.1.1 Ground Survey Sources
2.4.1.2 Photogrammetric Sources
2.4.1.3 Cartographic Sources
2.4.2 Quality of Sample Distribution
2.4.2.1 Contour Line Sampling
2.4.2.2 Regular Grid Sampling and Profiles
2.4.2.3 Progressive Sampling
2.4.2.4 Selective Sampling
2.4.2.5 Composite Sampling
2.4.2.6 Conversion of Sample Distribution
2.4.3 Quality of Interpolation
2.4.3.1 Contour Line Interpolation
2.4.3.2 Point Interpolation
2.4.3.3 Inverse Distance Weighting
2.4.3.4 Splines
2.4.3.5 Kriging
2.4.3.6 Terrain Specific Interpolators
2.4.3.7 Selecting an Interpolation Method
2.4.4 Quality Considerations when Creating DEMs

CHAPTER 3: ASSESSING DEM QUALITY
3.1 Rationale, Aim and Objectives
3.2 Introduction to the Assessment of DEM Quality
3.2.1 Visual Assessment of DEM Quality
3.2.1.1 Two-Dimensional DEM Rendering
3.2.1.2 Orthographic Display
3.2.1.3 DEM Derivatives
3.2.2 Quality Assessment by Geomorphometric Characterisation
3.2.3 Estimating DEM Accuracy
3.2.3.1 Selecting Sample Points
3.2.3.2 Sources of More Accurate Data
3.2.3.3 Accuracy Measures
3.3 Methodology
3.3.1 Study Area
3.3.2 Digital Elevation Data
3.3.3 Software
3.3.4 Data Preparation
3.3.5 DEM Generation
3.3.5.1 Interpolation Procedures
3.3.5.2 The DEMs
3.3.6 Visual Assessment of DEM Quality
3.3.7 Quality of Geomorphometric Characteristics
3.3.8 DEM Accuracy
3.3.8.1 Collection of GPS Measurements
3.3.8.2 DEM Accuracy Measures
3.3.9 Comparison of Quality Assessment Approaches
3.3.10 Comparison of DEM Quality
3.4 Results
3.4.1 Visual Assessment
3.4.1.1 Elevation Renderings and Orthographic Views
3.4.1.2 Rendering Gradient and Aspect Images
3.4.2 Geomorphometric Characteristics
3.4.3 DEM Accuracy
3.4.4 Comparison of Quality Assessment Approaches
3.4.5 Comparison of DEM Quality
3.5 Discussion
3.5.1 Assessing DEM Quality
3.5.1.1 Visual Assessment
3.5.1.2 Geomorphometric Indices
3.5.1.3 Accuracy Measures
3.5.1.4 A Comprehensive DEM Quality Report
3.5.2 The Quality of the DEMs
3.5.2.1 Data Distribution Issues
3.5.2.2 Interpolation Issues
3.5.2.3 Software Issues
3.6 Conclusions

CHAPTER 4: MODELLING THE SPATIAL DISTRIBUTION OF DEM ERROR
4.1 Rationale, Aim and Objectives
4.2 The Spatial Distribution of DEM Error
4.2.1 The Relationship between DEM Errors and Terrain
4.2.2 Modelling the Distribution of DEM Errors
4.2.2.1 Spatial Correlation
4.2.2.2 Spatially Distributed DEM Error Models
4.3 Methodology
4.3.1 Study Areas
4.3.2 Data
4.3.2.1 DEMs
4.3.2.2 Measurements of DEM Error
4.3.4 Deriving Terrain Parameters
4.3.5 Initial Investigations of the Relationship
4.3.6 Deriving Additional Terrain Parameters
4.3.6.1 Percentage Gradient
4.3.6.2 Mean Filtering
4.3.6.3 Standard Deviation Filtering
4.3.6.4 Polynomials
4.3.7 Modelling the Error-Terrain Relationship
4.3.7.1 Stepwise Regression Modelling
4.3.7.2 Generating Error Surfaces
4.3.8 Model Validation
4.4 Results for Snowdonia
4.4.1 Correlations
4.4.2 Regression Modelling
4.4.3 Model Validation
4.4.3.1 Predicted Errors
4.4.3.2 Error Surface Characteristics
4.4.3.3 Visual Assessment
4.5 Results for Mestersvig
4.5.1 Correlations
4.5.2 Regression Modelling
4.5.3 Model Validation
4.5.3.1 Predicted Errors
4.5.3.2 Error Surface Characteristics
4.5.3.3 Visual Assessment
4.6 Discussion
4.6.1 The Relationship between DEM Error and Terrain Character
4.6.2 The Quality of Error and Accuracy Surfaces
4.6.3 GPS Sample Point Issues
4.6.4 Choice of Terrain Parameters
4.6.5 Quality of Terrain Parameters
4.6.6 Differences between the Error and Accuracy Surfaces
4.7 Conclusions

CHAPTER 5: APPLYING KNOWLEDGE OF DEM ACCURACY TO UNCERTAINTY MODELLING
5.1 Aim and Objectives
5.2 Introduction to Modelling Uncertainty
5.2.1 Techniques for Assessing and Managing Uncertainty
5.2.1.1 Epsilon Bands
5.2.1.2 Error Propagation
5.2.1.3 Fuzzy Sets and Fuzzy Logic
5.2.1.4 Monte Carlo Simulation
5.3 Research Rationale
5.4 Methodology
5.4.1 Development of Monte Carlo Simulation Tools
5.4.1.1 Determine the Number of Simulations
5.4.1.2 Creating Random Fields
5.4.1.3 Monte Carlo Simulation
5.4.2 Comparison of Simulations Based on Global and Local Accuracy
5.5 Results
5.5.1 Elevation
5.5.2 Gradient
5.5.3 Upstream Area
5.5.4 Topographic Index
5.5.5 Watershed
5.6 Issues Raised by the Comparison of Global and Local Simulation Results
5.6.1 The Quality of the Simulations
5.6.2 Usefulness of the Simulation Results
5.6.3 Applying Uncertainty Knowledge
5.6.4 Other Approaches to Modelling Uncertainty
5.7 Conclusions

CHAPTER 6: ADVANCES IN THE UNDERSTANDING AND HANDLING OF DEM QUALITY AND UNCERTAINTY
6.1 Methodologies and Tools for Assessing DEM Quality
6.1.1 Geomorphometric Quality Assessment
6.1.2 Accuracy Assessment
6.1.3 A DEM Quality Report
6.2 Systematic Investigation of the Causes of Reduced DEM Quality
6.3 Modelling Uncertainty in DEM-derived Topographic Variables
6.3.1 Monte Carlo Simulation
6.3.2 Uncertainty in Topographic Variables
6.4 Further Work
6.4.1 Quality Assessment Work
6.4.2 Uncertainty Modelling Work

REFERENCES
APPENDIX 1: KRIGING TRIAL
APPENDIX 2: THE DEM UNCERTAINTY AND QUALITY ARCVIEW EXTENSION
APPENDIX 3: MESTERSVIG TOPOGRAPHIC MAP
APPENDIX 4: SNOWDON 13 CORRELATIONS
APPENDIX 5: SNOWDON 225 CORRELATIONS
APPENDIX 6: MESTERSVIG 13 CORRELATIONS
APPENDIX 7: MESTERSVIG 225 CORRELATIONS
APPENDIX 8: DEM QUALITY REPORT
APPENDIX 9: CD ROM

List of Figures

Fig. 2.1 DEM and TIN visualisations
Fig. 2.2 Production process for the Ordnance Survey’s Landform Profile digital contour data
Fig. 2.3 Patch selection procedures
Fig. 2.4 Dividing the search area into quadrants and selecting two points from each quadrant
Fig. 3.1 Raster rendering of a DEM
Fig. 3.2 Orthographic displays
Fig. 3.3 Rendered DEM derivative maps
Fig. 3.4 Identifying DEM terracing
Fig. 3.5 Mean error and accuracy
Fig. 3.6 The Snowdonia study area
Fig. 3.7 The DEM Uncertainty and Quality menu
Fig. 3.8 GPS Fieldwork
Fig. 3.9 Locations of Area 1 and Area 2
Fig. 3.10 DEM produced using spline with tension interpolation and a 12 point radius
Fig. 3.11 DEM produced using linear interpolation of 10m contours
Fig. 3.12 DEM produced using linear interpolation of 50m contours
Fig. 3.13 DEM produced from all contour vertices using Idrisi’s INTERPOL routine with a weighting of 2
Fig. 3.14 View of Area 1 for DEM produced from all contour vertices using Idrisi’s INTERPOL routine with a weighting of 5
Fig. 3.15 DEM produced from regular 50m grid of points using Idrisi’s INTERPOL routine with a weighting of 2
Fig. 3.16 Derivatives of DEM produced using spline with tension and a 12 point search radius
Fig. 3.17 Derivatives of DEM produced using linear interpolation of 10m contours
Fig. 3.18 Derivatives of DEM produced using linear interpolation of 50m contours
Fig. 3.19 Derivatives of DEM produced from all contour vertices using inverse distance weighting with a weight of 2
Fig. 3.20 Derivatives of DEM produced from all contour vertices using inverse distance weighting with a weight of 5
Fig. 3.21 Derivatives of DEM produced from a regular 50m grid of points using inverse distance weighting with a weight of 2
Fig. 3.22 Potential storage of a DEM’s quality report
Fig. 4.1 Location of the Mestersvig study area
Fig. 4.2 Digitised contours for the Mestersvig site
Fig. 4.3 The aspect vector concept
Fig. 4.4 The non-linear relationship between gradient measured in percent and gradient measured in degrees
Fig. 4.5 Snowdonia SpTen12’s adjusted R² values plotted against number of variables used in stepwise regression modelling
Fig. 4.6 Terrain parameters ranked according to strength of correlation and plotted against correlation coefficient
Fig. 4.7 Cumulative frequency distribution for standard deviation of error surfaces
Fig. 4.8 Standard deviation (SD) surfaces draped over orthographic views of the corresponding Snowdonia DEM
Fig. 4.9 Terrain parameters ranked according to strength of correlation and plotted against correlation coefficient
Fig. 4.10 Cumulative frequency distribution for standard deviation of error surfaces
Fig. 4.11 Standard deviation (SD) surfaces draped over orthographic views of the corresponding Mestersvig DEM
Fig. 5.1 Epsilon bands
Fig. 5.2 The Monte Carlo simulation procedure
Fig. 5.3 Suitable number of realisations determined from a graph of N plotted against variability
Fig. 5.4 RMSE of elevation
Fig. 5.5 Relative RMSE of elevation
Fig. 5.6 RMSE of gradient
Fig. 5.7 Relative RMSE of gradient
Fig. 5.8 RMSE of upstream area
Fig. 5.9 RMSE of topographic index
Fig. 5.10 Probability of being within the Noret watershed
Fig. 5.11 An illustration of the relationship between uncertainty and terrain character
Fig. 5.12 Probability-based watersheds
Fig. A2.1 The DEM Uncertainty and Quality menu

List of Tables Table 2.1 DEM derivatives and their application

18

Table 2.2 Factors Influencing Quality

20

Table 3.1 Use of momental statistics to describe DEM accuracy

57

Table 3.2 Generated DEMs

67

Table 3.3 Geomorphometric quality indices

89

Table 3.4 DEM Accuracy Measures

93

Table 3.5 Correlation between quality measures

94

Table 3.6 Results of stepwise regression analysis of quality measures and geomorphometric and accuracy scores

95

Table 3.7 DEM Quality Measures and Scores

96

Table 3.8 Comparison of scores and ranks for different sampling patterns

102

Table 3.9 Comparison of scores and ranks for different contour intervals

103

Table 3.10 Comparison of scores and ranks for different numbers of contour vertices

104

Table 3.11 Comparison of scores and ranks for different grid spacings

105

Table 3.12 Comparison of scores and ranks for different interpolation methods

106

Table 3.13 Comparison of scores and ranks for Idrisi and ArcView

108

Table 4.1 Terrain parameters

119

Table 4.2 Snowdonia‟s most significant correlations and number of significant correlations of elevation error with the initial 12 terrain parameters

127

Table 4.3 Snowdonia‟s most significant correlations and number of significant correlations of elevation error with all 225 terrain parameters

127

Table 4.4 Results of regression modelling for Snowdonia

129

Table 4.5 Regression equation variables for Snowdonia

130

Table 4.6 Distribution of actual and corrected errors for Snowdonia

131

Table 4.7 Summary statistics for error surfaces and mean error surfaces of Snowdonia

132

Table 4.8 Summary statistics for accuracy surfaces of Snowdonia

133

List of Tables

iv

Table 4.9 Mestersvig's most significant correlations and number of significant correlations of elevation error with the initial 12 terrain parameters
Table 4.10 Mestersvig's most significant correlations and number of significant correlations of elevation error with all 225 terrain parameters
Table 4.11 Results of regression modelling for Mestersvig
Table 4.12 Regression equation variables for Mestersvig
Table 4.13 Distribution of actual and corrected errors for Mestersvig
Table 4.14 Summary statistics for error surfaces and mean error surfaces of Mestersvig
Table 4.15 Summary statistics for accuracy surfaces of Mestersvig
Table 5.1 Monte Carlo Simulation Summary Statistics Grids
Table 5.2 Summary Statistics for RMSE Values for Elevation Uncertainty Simulation
Table 5.3 Summary Statistics for RMSE Values for Gradient Uncertainty Simulation
Table A1.1 Kriging geomorphometric indices
Table A1.2 Kriging accuracy measures
Table A2.1 Scripts associated with DEM Uncertainty and Quality menu items

Chapter 1: Introduction

The term Geographic Information System (GIS) refers to a type of software that is used to describe, understand and model the spatial distribution of phenomena across the earth's surface and the natural and anthropogenic processes acting upon these phenomena. The GIS acronym is also used for Geographic Information Science, which is the discipline concerned with how GISystems are applied to the description, understanding and modelling of spatial phenomena and processes.

Spatial data, also known as geographic data, lie at the heart of any GIS. Spatial data describe the location and characteristics of earth surface features and phenomena. A digital elevation model (DEM) is a type of spatial data set, which describes the elevation of the land surface. The height and form of terrain have a fundamental influence on most environmental phenomena. Consequently, DEMs are widely used in environmental applications of GIS (Moore et al., 1991; Stocks & Heywood, 1994; Weibel & Heller, 1991).

Spatial analysis is the suite of methods that can be applied to transform spatial data into information describing spatial patterns, spatial relationships and spatial processes. Longley et al. (2001, p.278) describe spatial analysis as the “crux of GIS, the means of adding value to geographic data, and of turning data into useful information.” Spatial modelling is the application of a series of spatial analysis techniques, which together lead to the derivation of new spatial data representing an earth surface phenomenon.

Spatial modelling often achieves only limited success due to the quality of source data. Digital elevation data, and other spatial data sets, are subject to inherent errors (Clark, 1993; Fisher, 1994). Weibel & Brändli (1995) urge users to appraise the quality of their elevation data and the derived Digital Elevation Models (DEMs). Data sets derived from DEMs have been found to be very sensitive to the quality of the DEM (Wood, 1994). Therefore, having made a DEM quality assessment, the user needs to consider the influence of DEM quality on derived products and models (Miller and Morrice, 1996). Despite increasing concern for understanding and working with the uncertainty within DEMs, knowledge about DEM error is still at a primitive stage and incorporation of this
knowledge into DEM-based modelling applications has only developed to a limited extent (Heywood, et al., 1998).

1.1 Digital Elevation Models

Within the context of Geographical Information Systems (GIS), several types of surface model may be used in a spatial modelling environment, including DEMs, Digital Terrain Models (DTMs) and Triangulated Irregular Networks (TINs). The study presented here is concerned solely with DEMs, because they are the type of surface model most commonly used for spatial modelling applications and particularly environmental modelling (Stocks & Heywood, 1994; Band, et al., 1995; Weibel & Brändli, 1995; Garbrecht & Martz, 1999). A DEM can be defined as a regular two-dimensional array of height values describing the varying elevation of an area's terrain (Burrough & McDonnell, 1998).

Information about the terrain surface plays a key role in nearly all environmental research including hydrology, geomorphology, ecology and other disciplines (Theobald, 1989; Stocks & Heywood, 1994; Garbrecht & Martz, 1999). Therefore a DEM is a fundamental requirement for many GIS applications, both directly due to the influence of elevation on many environmental phenomena and indirectly due to the influence of variables derived from a DEM such as gradient and aspect on environmental phenomena and processes (Stocks & Heywood, 1994). For any environmental modelling application, which involves a DEM, the quality of the DEM will play a key role in determining the quality of the model outputs.

1.2 Uncertainty

By definition a model is an approximation of reality. A spatial data set, such as a DEM, is a model of real world features or phenomena. A series of spatial analysis techniques used to manipulate and process data sets in spatial modelling represents a computational model of real world processes and relationships. Both data models and computational models reduce the complexities of the real world through simplification, reduction and generalisation. The goal of a spatial modelling project may be to produce as close an approximation of reality as possible. In other instances a more abstract or less realistic representation of reality is a more useful goal as this removes confusing details to reveal
the most fundamental factors, processes or trends. Whatever the goal there will be a lack of knowledge of: how well a model represents reality; how good a representation of reality is required; and therefore, how reliable the conclusions are which can be drawn from the model. This lack of knowledge can be expressed as uncertainty.

In the context of spatial modelling, uncertainty has two components: uncertainty about the quality of data and uncertainty about the quality of processing applied to the data (Openshaw, et al., 1991; Goodchild, et al., 1994). This research is concerned only with the data quality aspects of uncertainty – namely the quality of DEMs. Data quality is used to describe how adequately a data set represents reality (Ehlschlaeger & Goodchild, 1994; Wood, 1994). Data quality is fundamentally dependent on the degree of error within a data set. Error is the difference between a data set value and the true value. Accuracy is a measure that aggregates or summarises the errors within a data set. All GIS data sets are to some extent contaminated by error (Goodchild & Gopal, 1989; Openshaw, et al., 1991; Heuvelink & Burrough, 1993; Burrough & McDonnell, 1998). However, it should be noted that accuracy is one of a number of data characteristics that together determine the overall quality of a data set (Burrough & McDonnell, 1998).

1.2.1 Assessing DEM Quality

Assessment of DEM quality is commonly restricted to reporting a Root Mean Square Error (RMSE) value. For example, in the UK, the Ordnance Survey's digital contour data have a quoted accuracy of +/- 1.0 m to 1.8 m RMSE (Ordnance Survey, 1999). The United States Geological Survey (USGS) describes the accuracy of its 7.5 minute DEMs with one RMSE value for each quadrangle or tile. This RMSE value is based on the difference between DEM elevation and the elevation of test points measured by field survey or aerotriangulation, or taken from a spot height or point on a contour line from an existing source map (USGS, 1998). The RMSE for each quadrangle is calculated from 28 test points. Such quality assessments have limitations associated with the RMSE measure itself and with the way the measure is derived.
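As an illustration of the measure discussed above, the following minimal Python sketch computes an RMSE from paired DEM and reference elevations. The function name and all elevation values are hypothetical and are not data from this research. The second part of the sketch shows two quite different error patterns that share the same RMSE, which is one reason why a single value is a limited description of quality.

```python
import math

def rmse(dem_elevations, reference_elevations):
    """Root mean square error between DEM and reference elevations (same units)."""
    if len(dem_elevations) != len(reference_elevations) or not dem_elevations:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = [(d - r) ** 2 for d, r in zip(dem_elevations, reference_elevations)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical test points (metres); not data from the thesis.
dem = [102.0, 98.5, 110.0, 95.0]
ref = [100.0, 99.5, 107.0, 95.0]
print(round(rmse(dem, ref), 3))

# Two different error patterns with an identical RMSE of 2.0 m:
truth = [0.0, 0.0, 0.0, 0.0]
uniform_errors = [2.0, 2.0, 2.0, 2.0]       # every point is 2 m too high
concentrated_errors = [4.0, 0.0, 0.0, 0.0]  # one point is 4 m too high
print(rmse(uniform_errors, truth), rmse(concentrated_errors, truth))
```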

The assessment is restricted to just estimating elevation error and only describing the frequency distribution of this error, the DEM's accuracy, with one of a number of
momental statistics – the RMSE. Other momental statistics, visual examination of the surface model and statistics summarising key morphometric characteristics of the DEM can provide valuable additional information about the quality of a DEM. Also, a single nationwide figure in the case of the Ordnance Survey or a quadrangle-wide figure in the case of the USGS is a global estimate, which does not reflect the spatially varying nature of DEM error. It is known, when deriving a DEM from stereo pairs of aerial photographs, that errors are likely to be larger on shaded slopes and on smooth, featureless terrain. Empirical evidence of Carter (1988), Guth (1992), Wood (1993) and Monckton (1994) indicates that DEM error is spatially variable and spatially autocorrelated. This knowledge of the variation in error is not communicated by the single RMSE value.

In terms of how the RMSE is derived, it is of limited use because it often describes the accuracy of the source elevation data, e.g. contour lines and spot heights, rather than the derived DEM. Also, the accuracy estimate often does not relate to true elevation, but the elevation recorded in another data source.

1.2.2 Modelling DEM Error

Given that DEM error has been observed as spatially variable and autocorrelated, an accuracy surface would give a better representation of the distribution of error within a DEM than a single RMSE value. As this accuracy surface would be a model of DEM error, it would be sensible for the surface model used to represent this accuracy surface to take the same form as the DEM, i.e. a two dimensional array of accuracy values. Certain types of terrain – steep slopes, shaded slopes and featureless terrain – seem more likely to be associated with greater DEM errors. So it can be hypothesised that the distribution and scale of errors within a DEM are at least partly related to morphometric characteristics of the terrain. Identifying the relationship between DEM error and morphometric characteristics, such as gradient, curvature and relative relief, could form the basis for creating a DEM accuracy surface.
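The idea can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions: a single predictor (gradient), ordinary least squares, and invented values at the check points. The function name `fit_line` and all numbers are hypothetical; the regression models actually developed in this research use a larger set of terrain parameters and are described in Chapter 4.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Hypothetical check points: gradient (degrees) and absolute elevation error (m).
gradient_at_points = [2.0, 5.0, 10.0, 20.0, 30.0]
abs_error_at_points = [0.6, 0.9, 1.4, 2.4, 3.4]

a, b = fit_line(gradient_at_points, abs_error_at_points)

# Hypothetical gradient grid with the same shape as the DEM.
gradient_grid = [[1.0, 4.0, 12.0],
                 [3.0, 15.0, 25.0]]

# Applying the fitted relationship cell by cell gives a grid of predicted
# error magnitude, i.e. a simple accuracy surface in the same form as the DEM.
accuracy_surface = [[a + b * g for g in row] for row in gradient_grid]
```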

1.2.3 Applying Knowledge of DEM Uncertainty

Single global estimates of DEM accuracy have been used to visualise DEM quality by showing “epsilon bands” around contour lines and catchment boundaries, e.g. Chrisman
(1983), and by using stochastic techniques, such as Monte Carlo simulation to derive a sample of potentially valid model outputs reflecting the influence of DEM uncertainty on the modelling process (Openshaw, et al., 1991; Fisher, 1994; Monckton, 1994; Miller and Morrice, 1996). A spatially distributed model of DEM accuracy would provide a more detailed and reliable basis for stochastic simulations than a single RMSE value.
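A minimal Python sketch of such a stochastic simulation is given below. It assumes that the error in each cell is independent and normally distributed with zero mean, which ignores the spatial autocorrelation of DEM error noted in §1.2.1; the toy model output (the gradient between two cells), the function names and all values are hypothetical. It is intended only to show how a spatially varying accuracy surface can replace a single global RMSE as the source of the perturbations.

```python
import random
import statistics

def simulate_dem(dem, sd_surface, rng):
    """Return one DEM realisation with independent Gaussian noise per cell.

    The standard deviation for each cell is read from the matching cell of
    sd_surface (assumption: zero-mean, spatially uncorrelated error).
    """
    return [[z + rng.gauss(0.0, sd) for z, sd in zip(z_row, sd_row)]
            for z_row, sd_row in zip(dem, sd_surface)]

def gradient_percent(dem, cell_size):
    """Toy model output: gradient (per cent) between cells (0, 0) and (0, 1)."""
    return 100.0 * (dem[0][1] - dem[0][0]) / cell_size

def monte_carlo(dem, sd_surface, cell_size, n_runs, seed=0):
    """Run the toy model on n_runs realisations; return (mean, standard deviation)."""
    rng = random.Random(seed)
    outputs = [gradient_percent(simulate_dem(dem, sd_surface, rng), cell_size)
               for _ in range(n_runs)]
    return statistics.mean(outputs), statistics.stdev(outputs)

# Hypothetical 2 x 2 DEM (metres) with a 10 m grid spacing.
dem = [[100.0, 110.0], [105.0, 120.0]]
global_rmse = [[1.0, 1.0], [1.0, 1.0]]       # one value everywhere
accuracy_surface = [[0.5, 2.0], [1.0, 1.5]]  # spatially varying values

mean_global, sd_global = monte_carlo(dem, global_rmse, 10.0, 1000)
mean_local, sd_local = monte_carlo(dem, accuracy_surface, 10.0, 1000)
```

The spread of the simulated outputs, rather than a single deterministic value, is then used to express uncertainty in the model result.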

1.3 The Need for Research into DEM Accuracy, Quality and Uncertainty

The quality of spatial data and uncertainty about the use of these data in GIS-based analysis and modelling applications are issues that have received attention mainly since the 1990s. However, there is still not enough known about the quality of spatial data sets and how this quality influences spatial modelling outcomes. This is a particularly important issue in the case of DEMs because of their widespread use and the variety of data sets that are derived. Holistic and systematic studies of DEM quality are lacking.

DEM quality is a complex issue (see Chapter 2). There are a number of aspects to DEM quality: elevation accuracy, geomorphometric characteristics, and model limitations. A number of factors influence DEM quality: data source, sampling pattern, sampling density, and interpolation method.

There is a need for thorough and comprehensive investigation of the quality of DEMs, both in terms of the accuracy of elevation values and the quality of geomorphometric characteristics of DEMs. Building on this investigation, work is needed to determine how DEM quality affects the uncertainty of modelling results. The work presented in this thesis addresses these two tasks.

1.4 Research Scope

The above sections have alluded to some boundaries to the scope of this research. It is worth clearly defining the scope of the research here.

First, this research is confined to the study of DEMs and not other forms of surface model. This restriction is necessary because the different types of surface model are derived from differing data sources, using differing methods to create the models and subsequently tend to be applied to different types of application. These variations mean
that uncertainty associated with each model varies in nature and extent. Applying this research to more than one form of surface model would therefore be unfocused.

Second, this research is conducted in the context of GIS-based environmental modelling of mountainous regions. DEMs, and other types of surface model, are also applied in fields as diverse as civil engineering and scientific visualisation. However, each application field tends to use DEMs of a certain scale or resolution, derived from particular data sources and modelled with particular software. Restricting the study to DEMs used for environmental modelling of mountain regions improves the focus of the research.

Third, this research only considers the data quality component of uncertainty associated with the use of DEMs. Furthermore significant portions of this thesis concentrate on DEM error rather than the other data quality and process quality issues which cause uncertainty in the results of DEM-based modelling applications.

1.5 Research Aims

The overall aim of this research is to further understanding of DEM uncertainty and improve the incorporation of uncertainty knowledge into DEM-based modelling applications.

In fulfilling this aim, three subsidiary goals have been defined: To assess the quality of DEMs; To derive a spatially distributed model of DEM accuracy; and, To enhance the ability to assess uncertainty about the outcomes of DEM-based modelling applications.

1.6 Thesis Outline

Chapter 2 of this thesis examines the topics, concepts and issues that are central to this research. These are DEMs, quality, accuracy and uncertainty. The following three chapters present research relating to each of the three subsidiary goals listed in §1.5. Each of these three chapters provides background information relevant to the particular area of work concerned, defines the aim and objectives of the research, describes the methodology, presents and discusses the results and draws conclusions. First, Chapter 3 describes a systematic investigation into DEM quality and presents a method for comprehensively assessing DEM quality. Second, Chapter 4 presents and tests a new method for building a model of DEM accuracy. Third, Chapter 5 develops and tests ways of using this accuracy model, which improve understanding of how DEM quality affects uncertainty in modelling outputs and give a representation of the level of uncertainty. Chapter 6 draws together the main findings of the research and examines areas where further research is required.


Chapter 2: DEMs, Accuracy, Quality and Uncertainty

The literature reviewed in this chapter covers four main topics: DEMs; quality, accuracy, error and uncertainty; DEM quality; and, causes of reduced DEM quality. Other relevant literature is reviewed in Chapter 3 – Assessing DEM Quality, Chapter 4 – Modelling DEM Error and Chapter 5 – Applying Knowledge of DEM Uncertainty.

2.1 Digital Elevation Models

The height and shape of the terrain surface have an important influence on the spatial distribution of environmental phenomena and the nature of environmental processes acting at a location. Therefore, most environmental spatial modelling projects require a spatial data set that represents the terrain surface. The terrain surface is a continuous phenomenon – everywhere has an elevation either above or below sea level. However, we are only able to record that elevation at a limited number of discrete locations. Continuous phenomena cannot be represented in their entirety. They must be approximated and/or sampled (Goodchild, et al., 1994) to create a model of the surface.

2.1.1 Definitions

In the context of GIS, three types of terrain model are commonly used: Digital Elevation Models (DEMs); Triangulated Irregular Networks (TINs); and, Digital Terrain Models (DTMs). In the literature, there is some inconsistency in the use of this nomenclature (Burrough, 1986; Weibel & Heller, 1991; Wood, 1996).

Many use the terms DEM and DTM interchangeably. DTM is used by some for any model of the elevation of the terrain surface (Carrara, et al., 1997; Heywood et al., 1998). Wood (1996) defines a DTM as a surface model that includes explicit representation of surface form by means of ridgelines, valley lines, form lines and spot heights. Here, in common with Schmid-McGibbon & Eyton (1996), a DTM is defined as
any digital data set that describes some attribute of the terrain surface, such as elevation, slope gradient or slope aspect. A DEM and other data sets derived from the DEM, such as gridded gradient maps, are types of DTM.

The term DEM is also used by some for any model of the elevation of the terrain surface (Burrough & McDonnell, 1998; Gao, 1997). Here, in common with Wood (1996), the term DEM is used to describe a specific form of terrain model, namely a raster data set, representing the elevation of the terrain surface as a two dimensional array or matrix of height values. The DEM matrix is visualised by colouring each element of the matrix, or cell, according to its elevation value from a palette of graded colours or grey levels (Fig. 2.1a). An orthographic pseudo-perspective representation can be achieved by draping the colour map over an extruded representation of the surface (Fig. 2.1b).

A TIN is a vector-based form of representing elevation. Irregularly distributed points at which elevation is recorded are joined to form a mosaic of irregularly shaped triangular facets (Fig. 2.1c). The triangular facets can be shaded in the same way as DEM cells according to their mean elevation and can also be displayed orthographically (Fig. 2.1d).

A fourth term, Digital Surface Model (DSM), is becoming more commonly used to describe a model of an upper surface that includes vegetation and buildings, as opposed to a DEM that is intended to model the bare earth surface. There can be situations where a DEM records the elevation of buildings or vegetation, such as a forest canopy, rather than the ground elevation, particularly when the DEM is photogrammetrically derived. However, it is becoming increasingly important to distinguish between DEMs and DSMs as derivation of elevation models from remote sensing becomes more common and new techniques such as LIDAR (§2.1.3.3) are able to discern several surfaces (Lillesand & Kiefer, 2000).


Fig. 2.1 DEM and TIN Visualisation: a) planimetric DEM view; b) orthographic DEM view; c) planimetric view of TIN vertices and breaklines; d) orthographic view of TIN vertices, breaklines and shaded facets.

2.1.2 The Benefits and Drawbacks of DEMs

Only DEMs are considered in this research. In GIS-based spatial modelling, they are the most commonly used form of terrain surface model (Martz & Garbrecht, 1992). DEMs were first developed and used in the 1970s and became popular because the simple matrix of elevation values was easily processed (Evans, 1972). More recently the popularity of DEMs has been reinforced because they are easily used in a GIS environment (Wood, 1996). DEMs can be easily used to derive a number of other
terrain-related data sets, such as slope gradient, flow direction and wetness index (Burrough & McDonnell, 1998) (see §2.1.4).

The use of DEMs can have disadvantages. There can be a high level of data redundancy in areas of uniform terrain (Fairfield & Leymarie, 1991; Garbrecht & Martz, 1999). A large number of raster cells may be describing a simple planar slope or flat area. A DEM cannot adapt to different terrain complexities (Garbrecht & Martz, 1999). The data redundancy problem could be resolved by increasing the grid spacing, but the grid spacing for the whole DEM would have to be increased and valuable information in a less uniform area could be lost. Sampling theory indicates that a DEM only represents features or terrain heterogeneity whose size or spatial period is at least twice the grid spacing (Blais et al., 1986; Muehrcke, 1972, cited Carter, 1988; Pike, 1988; Theobald, 1989; Weibel & DeLotto, 1988). Thus the fidelity with which a DEM represents an area of terrain depends on the scale of the DEM, as represented by its resolution, and the scale of the landforms being modelled.

TINs avoid the data redundancy problem and can adapt to varying terrain complexity, but the range of types of analysis for which they can be used is much more limited than for DEMs (Burrough & McDonnell, 1998; Garbrecht & Martz, 1999). Additionally, the lack of data redundancy may not give rise to smaller file sizes, as one would expect. A DEM only needs to store elevation values, as the geographical location of a raster cell is implicit in its position in the array (Wood, 1996). A point in a TIN must store its x and y coordinates, its elevation and the identifiers of the other points to which it is connected. Although efficient in terms of number of points, the TIN has a high data volume per point. Consequently a DEM can give more detail than a TIN of the same file size (Gao, 1998). For an in-depth review and discussion of the differences between DEMs and TINs see Kumler (1994).
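The implicit georeferencing of a raster cell can be illustrated with a short Python sketch. It assumes a north-up grid whose origin is the top-left (north-west) corner, with rows counted southwards and columns eastwards; the function name, the origin coordinates and the grid spacing are hypothetical.

```python
def cell_centre(row, col, x_min, y_max, cell_size):
    """Map coordinates of a cell centre, derived only from the array position.

    Assumes row 0 lies along the northern edge and column 0 along the
    western edge of the grid.
    """
    x = x_min + (col + 0.5) * cell_size
    y = y_max - (row + 0.5) * cell_size
    return x, y

# Hypothetical grid: north-west corner at (1000, 2000), 10 m grid spacing.
print(cell_centre(0, 0, 1000.0, 2000.0, 10.0))
print(cell_centre(2, 3, 1000.0, 2000.0, 10.0))
```

Only the corner coordinates and the grid spacing need to be stored once for the whole DEM, whereas each TIN point carries its own coordinates and connectivity.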

2.1.3 Generating DEMs

There are two main approaches to generating a DEM: interpolating the regular grid from an irregularly distributed elevation data set, or generating the grid directly from remotely sensed imagery using photogrammetric techniques. Currently, use of a third technique, Light Detection and Ranging (LIDAR), is becoming increasingly common.


2.1.3.1 Photogrammetric Techniques

Photogrammetry provides the most frequently used data sources and techniques for generating DEMs (Stocks & Heywood, 1994), either by direct generation of DEMs or indirectly via its use in topographic mapping for production of contour lines (see §2.4.1.2). Photogrammetry either involves stereoscopic techniques for interpretation of aerial photography or digital image correlation applied to aerial photographs.

The underlying principle of photogrammetric techniques is that a change in the elevation of the photographed surface causes a radial shift in the apparent planimetric location of that surface's features. This radial shift results in a parallax error. Overlapping photographic coverage, in which every location is recorded in two or more photographs, each taken from a different location, can be used to measure this radial shift and hence derive the elevation of a point (Campbell, 1991; Petrie, 1990a). This measurement can be made using an analytical stereoplotter to view the photography in stereo and record the elevation at each node of the output grid in turn. Alternatively, digital image correlation techniques automatically match up individual pixels from overlapping image pairs and calculate the height from the parallax shift (Allen & Shears, 1995; Ebner, 1992; Fukushima, 1988).

Use of photogrammetric sources has several advantages, particularly in mountain environments:

1. Remote sensing avoids the need to gain overland access to the area being surveyed across potentially remote and hazardous terrain (Stocks & Heywood, 1994);
2. Data with a wide areal coverage can be acquired in a relatively short time (Petrie, 1990a);
3. There is no need for accurate topographic maps to be available (Petrie, 1990a).

However, there are also a number of potential problems associated with obtaining aerial photography in mountainous areas, including:

1. Frequent cloud cover and lengthy snow cover limit the opportunities for good photography and may reduce the areal coverage, currency or clarity of aerial photographs;
2. The military or political sensitivity of a number of mountain regions may restrict the availability of data, or reduce the opportunities for flying aerial photography;
3. The high relief of mountain topography may cause detail to be unclear in shaded areas or totally obscured from view;
4. Dense tree cover can obscure the ground surface (Weibel & Brändli, 1995).

These problems may mean that no elevation data can be derived, or measurements are of lower accuracy, or the elevation model actually records the height of another surface, such as a forest canopy. The problems can be mitigated by acquiring additional photography with less cloud, snow or shadowing, collecting extra data, such as undertaking a ground survey in forested areas, or further processing, such as subtracting average tree height from forested areas (Lillesand & Kiefer, 2000).

2.1.3.2 Interpolating DEMs

In many situations directly generating a DEM from aerial photography is not a practical option due to the costs of purchasing suitable photography, software and hardware, non-existence of suitable photography or lack of suitably experienced personnel (Stocks & Heywood, 1991). The alternative is to interpolate the regular grid of a DEM from another elevation data set.

Topographic maps are a widely available data source in which elevation is represented by contour lines and spot heights. Ground survey techniques using total stations (electronic tacheometers) or Global Positioning System (GPS) receivers can also be used to collect elevation data. Measurements may be made either at a randomly distributed set of point locations or in a more structured pattern such as at equal intervals along transects.

The irregularly distributed contour lines, spot heights or profiles can be interpolated to create the DEM's regular matrix of elevation values. Interpolation can also be used to decrease the grid spacing of photogrammetrically derived DEMs if their resolution is too coarse. Burrough & McDonnell (1998, p. 98) define interpolation as “the procedure of predicting the value of attributes at unsampled sites from measurements made at point locations within the same area or region.” In terms of generating DEMs, the measurements made at point locations are spot heights, points along a transect or the vertices of contour lines. There are many interpolation algorithms for generating surface
models from point data, such as inverse distance weighting, spline fitting and Kriging, that have been reviewed and examined elsewhere (Burrough & McDonnell, 1998; Lam, 1983), and others that interpolate from contour lines (Carrara et al., 1997).
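As an illustration of one of the point-based algorithms named above, the following Python sketch applies inverse distance weighting to predict elevation at the centre of each cell of a small grid. It is a bare-bones version using all sample points and a distance exponent of 2; the function names, grid origin convention (north-west corner, rows counted southwards) and sample values are hypothetical, and the implementations in GIS packages typically add search radii and other refinements.

```python
import math

def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (xs, ys, z) samples."""
    numerator = 0.0
    denominator = 0.0
    for xs, ys, z in samples:
        distance = math.hypot(x - xs, y - ys)
        if distance == 0.0:
            return z  # the location coincides with a sample point
        weight = 1.0 / distance ** power
        numerator += weight * z
        denominator += weight
    return numerator / denominator

def interpolate_grid(samples, x_min, y_max, cell_size, n_rows, n_cols, power=2.0):
    """Predict elevation at each cell centre of a regular grid."""
    grid = []
    for row in range(n_rows):
        y = y_max - (row + 0.5) * cell_size
        grid.append([idw(x_min + (col + 0.5) * cell_size, y, samples, power)
                     for col in range(n_cols)])
    return grid

# Hypothetical spot heights: (x, y, elevation in metres).
spot_heights = [(2.0, 8.0, 120.0), (7.5, 6.0, 135.0), (5.0, 1.0, 110.0)]
dem = interpolate_grid(spot_heights, 0.0, 10.0, 5.0, 2, 2)
```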

The above methods for generating DEMs are examined in more depth in §2.4 where their influence on DEM quality is considered.

2.1.3.3 LIDAR

LIDAR is a remote sensing technique that involves emitting pulses of laser light from an aircraft or satellite towards the ground and measuring the return time. A LIDAR-equipped aircraft carries a pulsing laser, airborne GPS for determining sensor location, an Inertial Measuring Unit for measuring orientation of the sensor to the ground, a high accuracy clock, substantial computing power to process data in real time and data storage equipment (Lillesand & Kiefer, 2000).
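The relationship between return time and distance can be sketched as follows. The sketch assumes a pulse fired vertically downwards and uses the speed of light in a vacuum as an approximation; the function names and the example values are hypothetical, and an operational system would also use the GPS position and the measured sensor orientation to locate each return.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second (vacuum value, an approximation)

def pulse_range(return_time_s):
    """One-way distance to the reflecting surface from the two-way travel time."""
    return SPEED_OF_LIGHT * return_time_s / 2.0

def surface_elevation(sensor_elevation_m, return_time_s):
    """Elevation of the reflecting surface, assuming a vertically fired pulse."""
    return sensor_elevation_m - pulse_range(return_time_s)

# Hypothetical example: sensor at 1500 m, return received after 6 microseconds.
print(round(pulse_range(6.0e-6), 2))
print(round(surface_elevation(1500.0, 6.0e-6), 2))
```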

There are several advantages to LIDAR, which are encouraging its use for deriving DEMs (Lillesand & Kiefer, 2000). First, as with other remote sensing techniques, data can be captured in remote and otherwise inaccessible locations. Second, unlike aerial photography, data can be collected in steep, shadowed areas and at any time of day or night. Third, the laser pulses can penetrate vegetation. Therefore, one pulse can produce multiple returns reflected off different surfaces such as the vegetation layers in a forest. Fourth, a high density of high accuracy elevation measurements can be generated. Consequently, highly detailed and accurate DEMs and DSMs can be derived.

The two main drawbacks of LIDAR are, first, that the large amount of sophisticated equipment required for data collection makes LIDAR data sets expensive and, at the time of writing, available only for specific areas of developed countries. Second, the large data volumes can be cumbersome to handle, and separating multiple returns into different surface layers requires specialist software and expertise.

2.1.4 DEM Applications

DEMs are widely used in many fields of research and examples in the literature are numerous. This section does not attempt to review the use of DEMs in all these fields, but rather examines the type of information that can be derived from a DEM. Comprehensive reviews and bibliographies of the application of DEMs to hydrology, geomorphology, glaciology, landscape analysis, vegetation studies, conservation management and other fields include Gurnell and Montgomery (2000), Maidment (1996), Moore et al. (1991) and Price and Heywood (1994).

Elevation is just one of a number of properties of the terrain of an area which influence the distribution of environmental phenomena and the nature of environmental processes. The most important of these terrain properties can be grouped as attributes of surface form or surface topology and various environmental indices, which can be calculated by combining surface form and surface topology measures.

2.1.4.1 Surface Form

The shape of the terrain surface influences the distribution of many environmental phenomena and can act as a control on environmental processes. Characterising the shape of the terrain surface is important to many environmental modelling applications (Burrough & McDonnell, 1998). Geomorphometry is the measurement or quantification of surface form (Wood, 1996). Various authors have presented a large number of geomorphometric measures, many of which are closely related and overlapping in terms of the surface form attributes they describe. For example, Nogami (1995) describes 37 such measures. Evans (1972) presents a synthesised and standardised set of measures, which provide a comprehensive description of surface form, while minimising redundancy. Evans' five morphometric parameters are elevation, the first derivatives of elevation (gradient and aspect) and the second derivatives (plan and profile curvature). Gradient, also termed slope or slope angle, is the maximum rate of change of elevation. Aspect, also termed orientation or exposure, is the compass direction of this maximum rate of change. Profile curvature is the rate of change of gradient, while plan curvature is the rate of change of aspect.

In theory, these first and second order derivatives exist and can be calculated mathematically because terrain is a continuous surface. However, the raster grid format of a DEM is a discretised model of this continuous surface. Therefore, the first and second order derivatives must be approximated by computing a local continuous surface patch for each cell in turn, using the cell and its eight immediate neighbouring cells.
There are a variety of methods for making this approximation, which have been described and compared elsewhere (Burrough & McDonnell, 1998; Evans, 1972; Florinsky, 1998; Hodgson, 1995; Jones, 1998; Skidmore, 1989). Any GIS with surface analysis functionality will use one of these methods to calculate DEM derivatives.
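One of the simplest such approximations can be sketched in Python as follows. It uses central differences on the four orthogonal neighbours of a 3 x 3 window, which is only one of the methods compared in the literature cited above and is not necessarily the method used by any particular GIS. The function name, the window layout (row 0 is the northern row) and the convention that aspect is the downslope direction in degrees clockwise from north are assumptions made for this illustration.

```python
import math

def gradient_and_aspect(window, cell_size):
    """Gradient (degrees) and aspect (degrees clockwise from north) of the
    central cell of a 3 x 3 elevation window, using central differences.

    window[0] is the northern row and window[r][0] the western column.
    Aspect is returned as None where the surface is flat.
    """
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * cell_size)  # towards east
    dz_dy = (window[0][1] - window[2][1]) / (2.0 * cell_size)  # towards north
    gradient = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0.0 and dz_dy == 0.0:
        return gradient, None
    # The downslope direction is opposite to the direction of steepest ascent.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return gradient, aspect

# Hypothetical window rising towards the east by 1 m per 10 m cell.
rising_east = [[0.0, 1.0, 2.0],
               [0.0, 1.0, 2.0],
               [0.0, 1.0, 2.0]]
print(gradient_and_aspect(rising_east, 10.0))
```

For this window the slope faces west, so the aspect is 270 degrees and the gradient is the arctangent of 0.1.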

2.1.4.2 Surface Topology

Surface topology – the location of and linkages between surface features such as passes, peaks, pits, stream channels, ridges and watersheds – is a terrain property important to studying the flow of material, particularly in hydrological modelling. The starting point for deriving these surface topology properties is to determine the direction of flow from each cell. A number of techniques have been developed for automatically deriving flow direction from a DEM (Band, 1986; Hutchinson, 1989; Jenson & Domingue, 1988; Mark, 1984). The most common of these are the D8 algorithm, which routes flow from one cell towards its steepest downhill neighbour, and the F8 algorithm, which shares flow among all of a cell's downhill neighbours on a gradient-weighted basis (Burrough & McDonnell, 1998). The flow direction grid explicitly describes the hydrological connectivity between cells. It can then be used to calculate parameters such as upstream area and the extent of watersheds and to delineate stream channels and ridges (Moore et al., 1991).
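The D8 rule described above can be sketched in Python as follows. This is a minimal illustration that handles neither flat areas nor pits beyond reporting that no downhill neighbour exists; the function name, the representation of direction as a (row, column) offset and the example elevations are hypothetical.

```python
import math

# Offsets (d_row, d_col) of the eight immediate neighbours of a cell.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]

def d8_direction(dem, row, col, cell_size=1.0):
    """Offset of the steepest downhill neighbour, or None for a pit or flat cell.

    The drop to each neighbour is divided by the distance between cell
    centres, so diagonal neighbours are sqrt(2) cell widths away.
    """
    best = None
    best_slope = 0.0
    for d_row, d_col in NEIGHBOURS:
        r, c = row + d_row, col + d_col
        if 0 <= r < len(dem) and 0 <= c < len(dem[0]):
            distance = cell_size * math.hypot(d_row, d_col)
            slope = (dem[row][col] - dem[r][c]) / distance
            if slope > best_slope:
                best_slope = slope
                best = (d_row, d_col)
    return best

# Hypothetical DEM sloping down towards the south-east corner.
dem = [[9.0, 8.0, 7.0],
       [8.0, 5.0, 4.0],
       [7.0, 4.0, 1.0]]
print(d8_direction(dem, 1, 1))  # steepest descent is towards the south-east
print(d8_direction(dem, 2, 2))  # lowest cell: no downhill neighbour
```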

2.1.4.3 DEM-based Environmental Indices

A wide variety of indices have been developed to help the spatial analyst model the scale of environmental processes. Many of these combine the DEM-derived parameters described above with other data sets such as land cover and soil maps. Three such indices are given below. A number of others can be found in Moore et al. (1991).

The topographic index (Equation 2.1), also known as the wetness index, gives a measure of the influence of topography on soil saturation levels.

Equation 2.1: Topographic Index

$$w = \ln\left(\frac{A_s}{\tan\beta}\right)$$

where $w$ is the topographic index, $A_s$ is the upstream area and $\beta$ is the slope gradient.
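Equation 2.1 can be evaluated for a single cell as in the sketch below. The function name `topographic_index`, the choice of degrees for the gradient and the decision to reject zero values are assumptions made for the example; the units of upstream area depend on how it has been derived from the flow direction grid.

```python
import math


def topographic_index(upstream_area, gradient_degrees):
    """Topographic (wetness) index w = ln(As / tan(beta)) for one cell."""
    tan_beta = math.tan(math.radians(gradient_degrees))
    if upstream_area <= 0.0 or tan_beta <= 0.0:
        # The logarithm is undefined for flat cells or cells with no upstream
        # area; how these are handled is left to the application.
        raise ValueError("upstream area and gradient must both be positive")
    return math.log(upstream_area / tan_beta)


# Hypothetical cells sharing an upstream area of 100 units.
print(topographic_index(100.0, 2.0))   # gentle slope, higher index
print(topographic_index(100.0, 30.0))  # steep slope, lower index
```

For a fixed upstream area the index falls as the gradient steepens, which is consistent with its use as an indicator of soil saturation potential.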


The cosine of the solar illumination angle incident on a slope, $\cos i$ (Equation 2.2), is one parameter in calculations of irradiance, the amount of solar energy falling on a surface (Burrough & McDonnell, 1998). There are a number of techniques for calculating irradiance involving $\cos i$ (Dozier, 1980; Dozier & Frew, 1990; Dubayah & Rich, 1995; Kumar et al., 1997).

Equation 2.2: Cosine of Solar Illumination

$$\cos i = \cos\theta_0 \cos\beta + \sin\theta_0 \sin\beta \cos(\phi_0 - A)$$

where $\theta_0$ is the solar zenith angle, $\phi_0$ is the solar azimuth, $A$ is the slope's aspect and $\beta$ is the slope gradient.

The Universal Soil Loss Equation (Equation 2.3) estimates soil loss for agricultural land (Burrough & McDonnell, 1998). Both the slope length factor, L, and the slope factor, S, are derived from a DEM. The other factors can be derived from other data sets including precipitation, soil, land use and land cover maps.

Equation 2.3: Universal Soil Loss Equation

$$A = R \cdot K \cdot L \cdot S \cdot C \cdot P$$

where A is the annual soil loss in tonnes per hectare, R is the erosivity of the rainfall, K is the erodibility of the soil, L is a slope length factor calculated from slope length in metres (see Equation 2.3a below), S is a slope gradient factor calculated from slope gradient in per cent (see Equation 2.3b below), C is a land cover parameter and P is a land management factor.

Equation 2.3a: Slope Length Factor

$$L = \left(\frac{l}{22.1}\right)^{1/2}$$

where $l$ is slope length in metres.

Equation 2.3b: Slope Gradient Factor

$$S = 0.0065s^2 + 0.0454s + 0.065$$

where $s$ is slope gradient in per cent.

The above three examples give an indication of how DEMs are manipulated in many ways and combined with many other data sets for many different applications. Table 2.1, adapted from Burrough and McDonnell (1998), summarises this variety.
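Equations 2.2 and 2.3 can be sketched in a few lines, as below. The function names, the use of degrees for all angles and the input values are assumptions made for illustration. The slope length factor assumes an exponent of 0.5, the value commonly quoted for the Universal Soil Loss Equation on slopes steeper than about 5%; it is exposed as a parameter because other exponents are used on gentler slopes.

```python
import math


def cos_incidence(zenith_deg, azimuth_deg, gradient_deg, aspect_deg):
    """Equation 2.2: cosine of the solar illumination angle on a slope."""
    zenith = math.radians(zenith_deg)
    gradient = math.radians(gradient_deg)
    relative_azimuth = math.radians(azimuth_deg - aspect_deg)
    return (math.cos(zenith) * math.cos(gradient)
            + math.sin(zenith) * math.sin(gradient) * math.cos(relative_azimuth))


def slope_length_factor(length_m, exponent=0.5):
    """Equation 2.3a, with the exponent assumed to be 0.5 by default."""
    return (length_m / 22.1) ** exponent


def slope_gradient_factor(gradient_percent):
    """Equation 2.3b, with gradient in per cent."""
    s = gradient_percent
    return 0.0065 * s ** 2 + 0.0454 * s + 0.065


def usle_soil_loss(r, k, length_m, gradient_percent, c, p):
    """Equation 2.3: annual soil loss A = R * K * L * S * C * P."""
    return (r * k * slope_length_factor(length_m)
            * slope_gradient_factor(gradient_percent) * c * p)


# Hypothetical inputs: sun 45 degrees from the zenith and due south,
# shining on a 45 degree slope that also faces due south.
print(cos_incidence(45.0, 180.0, 45.0, 180.0))
# Hypothetical slope 22.1 m long at 9% with all other factors set to 1.
print(usle_soil_loss(1.0, 1.0, 22.1, 9.0, 1.0, 1.0))
```

With these inputs the sun is perpendicular to the slope, so the first value is close to 1. The second is also close to 1 because a 22.1 m slope at 9% corresponds to the reference plot on which the L and S factors are normalised.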



Table 2.1 DEM derivatives and their application

| Derivative | Description | Applications |
|---|---|---|
| Elevation | --- | Potential energy determination; climatic variables; cut and fill calculations. |
| Gradient | Rate of change of elevation | Overland and sub-surface flow; land capability assessment; vegetation types. |
| Aspect | Compass direction of steepest downhill gradient | Solar irradiance; evapotranspiration; vegetation types. |
| Profile curvature | Rate of change of gradient | Flow acceleration; erosion/deposition zones; soil and land evaluation indices. |
| Plan curvature | Rate of change of aspect | Converging/diverging flow; soil water properties. |
| Flow direction | Direction of downhill flow | Computing surface topology; material transport. |
| Upstream area | Number of cells upstream of a given cell | Watershed delineation; volume of material passing through a cell. |
| Stream length | Length of longest uphill path upstream of a given cell | Flow acceleration; erosion rates; sediment yield. |
| Stream channel | Cells with upstream area greater than a specified threshold | Location of flow; flow intensity; erosion/sedimentation. |
| Ridge | Cells with no upstream area | Drainage divides; vegetation studies; soil, erosion and geological analysis. |
| Wetness index | ln(upstream area/tan(gradient)) | Index of soil saturation potential. |
| Stream power index | upstream area * tan(gradient) | Index of the erosive power of overland flow. |
| Viewshed | Zones of intervisibility | Visual impact studies. |
| Irradiance | Amount of solar energy received | Vegetation and soil studies; glaciology. |

Source: adapted from Burrough & McDonnell (1998)



2.2 Quality, Accuracy, Error and Uncertainty

The issues of quality, accuracy, error and uncertainty are central to this thesis. This section reviews and discusses literature relevant to these four issues. The following section focuses on their importance in the context of DEMs. The different meanings of these four terms are not immediately apparent and indeed the terms are used interchangeably and ambiguously in some literature. Therefore, it is important to clearly define each term.

2.2.1 Quality

The quality of a spatial data set depends on how well it represents reality (Ehlschlaeger & Goodchild, 1994) and also for what purpose it is to be used (Wood, 1994). There is no such thing as bad data, just data that do not meet the requirements of the intended application (Hunter, 1997). The quality of a spatial data set resulting from a spatial modelling application has three components: the quality of the input data sets; the quality of the analysis model, i.e. the processing applied to the data sets; and, how the qualities of the data and model interact (Burrough & McDonnell, 1998). There are a number of factors influencing the quality of a spatial data set and other factors influencing the quality of the analysis model, which are summarised in Table 2.2.



Table 2.2 Factors Influencing Quality

| Data Quality | Quality of Analysis Model |
|---|---|
| 1. Currency | 1. Processing: numerical errors in the computer; limitations of computer representation of numbers |
| 2. Completeness: Is the whole study area covered? Are all necessary attributes given? | 2. Choice of analysis model |
| 3. Consistency: map scale; standard descriptions; relevance | 3. Misuse of logic |
| 4. Accessibility: format; copyright; cost | 4. Choice of interpolation method |
| 5. Accuracy and Precision: density of observations; positional accuracy; attribute accuracy; topological accuracy; lineage (when collected, by whom, how?) | 5. Error propagation |
| | 6. Classification and generalisation problems |
| | 7. Map overlay problems |

Source: adapted from Burrough & McDonnell (1998)

2.2.2 Accuracy and Error

Accuracy is one factor influencing the overall quality of a data set. Accuracy describes how similar a data set is to the real world or true values (Foote & Huebner, 1997a). Error is a specific measurement of the difference between a value in a data set and the corresponding true value. Goodchild et al. (1994) define accuracy as the difference between values recorded in a spatial data set and modelled or assumed values. They define error as the difference between the data set values and the true values. When dealing with continuous phenomena and their representation in GIS as surfaces, it is impossible to measure all true values and hence calculate all errors, so the true values must be modelled or estimated. In the case of continuous phenomena, therefore, one deals with accuracy indices derived from a limited number of error measurements.
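The distinction between individual errors and accuracy indices can be illustrated with a short sketch. Given a limited sample of check points, each error is the data set value minus a reference value treated as true, and indices such as the mean error, the standard deviation of error and the RMSE summarise the sample. The function name `accuracy_indices` and the sample values are hypothetical, and treating the reference measurements as true values is itself an assumption.

```python
import math


def accuracy_indices(data_values, reference_values):
    """Return (mean error, standard deviation of error, RMSE) for a sample
    of check points. Errors are data value minus reference value."""
    if len(data_values) != len(reference_values) or len(data_values) < 2:
        raise ValueError("need two equal-length samples of at least two points")
    errors = [d - r for d, r in zip(data_values, reference_values)]
    n = len(errors)
    mean_error = sum(errors) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd_error = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return mean_error, sd_error, rmse


# Hypothetical DEM elevations and reference elevations at four check points.
dem_elevations = [101.0, 99.0, 103.0, 97.0]
reference_elevations = [100.0, 100.0, 100.0, 100.0]
print(accuracy_indices(dem_elevations, reference_elevations))
```

Because the indices are computed from a sample, they are estimates: a different set of check points would generally give different values.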



2.2.3 Uncertainty

Apart from accuracy, ways of quantifying the quality factors listed in Table 2.2 are not obvious. It is difficult to objectively judge the quality of a data set or an analytical model. It can also be difficult to judge what level of quality is required for a particular application. Although accuracy indices can be calculated from a sample of error measurements, these indices are only estimates of accuracy. This lack of knowledge about quality is expressed as uncertainty (UCGIS, 1997).

Errors, inaccuracies and consequently uncertainty exist in all spatial data sets and all analysis models (Burrough & McDonnell, 1998; Eastman et al., 1993; Goodchild et al., 1994; Openshaw et al., 1991). This has long been known, yet prior to 1991 the uncertainty issue was largely ignored, Burrough (1986) being an exception. Standard GIS texts such as Star and Estes (1990) barely mention quality, accuracy, error and uncertainty. During the 1990s researchers recognised the importance of addressing the uncertainty issue. The launch of the NCGIA's research initiative “Visualising the Quality of Spatial Information” in June 1991 demonstrates this recognition (Goodchild et al., 1994). Goodchild (1993) sets out three primary goals to adequately address the uncertainty issue:

1. Each object in a GIS database has information on its accuracy;
2. Every operation and process tracks and reports error; and,
3. Every GIS product has a measure of its accuracy.

Using data and results without considering uncertainty about their quality can lead to inappropriate policy and management decisions (UCGIS, 1997). A good understanding of error leads to active quality control, allowing the analyst to manage and perhaps reduce errors, which in turn leads to better understanding of spatial distributions and processes and better optimised sampling and analysis strategies (Burrough & McDonnell, 1998). With the appropriate understanding and tools for quantifying and communicating uncertainty, the quality and appropriate use of existing spatial data sets could be judged, leading to more cost-effective use, and decision makers would be able to evaluate the risks associated with various policy alternatives (UCGIS, 1997). Better understanding of uncertainty will also encourage greater acceptance of Geographic Information Science as a valid scientific discipline (Hunter, 1998; UCGIS, 1997).



Various authors have proposed strategies for managing quality and uncertainty (Foote & Huebner, 1997b; Hunter, 1998; Rossiter, 1995). Research has been undertaken to improve accuracy, but work has concentrated on a few data sets and much remains unknown (UCGIS, 1997). Attention has focussed on the error in measurements, how to assess and express error, propagation of error through processing stages and the reporting of data quality (Eastman, 1999b). There has been little attention to how various uncertainties combine to influence the whole spatial analysis process, final results and the decisions drawn from them. There is still not enough attention to how errors occur and how they are propagated. Most studies are at a research level, and few are sufficiently systematic in their approach (Burrough & McDonnell, 1998; UCGIS, 1997). Goodchild's (1993) goals have not yet been achieved. Eastman (1999, p. 1) states that “we need to recognise uncertainty not as a flaw to be regretted and perhaps ignored, but as a fact of the decision making process that needs to be understood and accommodated.” He predicts that improving our understanding of uncertainty will cause a move away from hard decisions to soft, probabilistic results from which a final decision is taken based on the level of risk one can afford.

2.3 DEM Quality

Understanding quality and taking uncertainty into consideration are particularly important when using DEMs. First, as described in §2.1.4, DEMs are widely used and usually integrated and combined analytically with a variety of other data sets. A DEM of inadequate quality can have a widespread and significant impact on the outcomes of spatial analyses (Foote & Huebner, 1997a). Second, DEM derivatives can be very sensitive to error – small errors in a DEM can be amplified in its derivatives (Burrough & McDonnell, 1998; Guth, 1995; Wood, 1994).

DEM uncertainty relates to how confidently we can assume that the DEM is of adequate quality. The quality of a DEM is related to the degree to which the DEM represents the true terrain surface. DEM quality has three main components: the accuracy of the elevation values; the geomorphometric characteristics of the surface representation; and, the limitations of the model (Wood, 1996).



2.3.1 Elevation Accuracy

There are horizontal and vertical components to the accuracy of any elevation data. On a topographic map a spot height error may be due to the point being in the wrong location or an incorrect elevation value. Although both components also apply to DEMs, the two are inseparable. The location of elevation values is fixed within the raster matrix. Whatever the type of error, the result is a DEM cell with an incorrect elevation value (Ehlschlaeger & Shortridge, 1997).

2.3.2 Geomorphometric Characteristics

Geomorphometric characteristics relate to how well the DEM represents landforms and is a sensible model of the terrain surface. A DEM with elevation values rounded to the nearest metre may have accurate elevation values. However, there will be quantisation problems with DEM derivatives (Guth, 1995). For example, a gradient map derived from a DEM with a cell spacing of 10m and elevation values rounded to the nearest metre can only have gradients of 0%, 10%, 20% etc. Artefacts or erroneous landforms may be formed by particular interpolation methods when generating the DEM. For example, linear interpolation of contour lines (see §2.4.3.1) can create ramps (Hutchinson, 1988), spline interpolation (see §2.4.3.4) can create deep pits (Wood, 1996) and inverse distance weighting interpolation (see §2.4.3.3) can create terraces (Burrough & McDonnell, 1998). Such artefacts cause erroneous calculation of first and second order derivatives and can inhibit hydrological modelling routines (Garbrecht & Martz, 1999).
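The quantisation effect described above can be reproduced with a short sketch. It assumes that gradient is computed as the simple elevation difference between adjacent cells, which is the case that yields steps of exactly 10% for a 10m spacing; other derivative schemes give different step sizes but the same kind of stepping. The function name and the profile values are hypothetical.

```python
def adjacent_cell_gradients(profile, cell_spacing):
    """Gradient (per cent) between each pair of adjacent cells along a profile."""
    return [100.0 * abs(b - a) / cell_spacing
            for a, b in zip(profile, profile[1:])]


# Hypothetical elevations (m) along one row of a DEM with a 10 m cell spacing.
surveyed_profile = [100.0, 100.4, 101.3, 101.9, 103.2]
# The same elevations stored to the nearest metre.
rounded_profile = [float(round(z)) for z in surveyed_profile]

print(adjacent_cell_gradients(surveyed_profile, 10.0))
print(adjacent_cell_gradients(rounded_profile, 10.0))
```

The rounded elevations are each within half a metre of the surveyed values, yet the gradients derived from them collapse onto multiples of 10%, whereas those derived from the unrounded profile vary continuously.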

2.3.3 Limitations of the Model

The DEM's matrix of elevation values is a discretised representation of a continuous surface – it is an approximation of reality. The horizontal spacing of the matrix elements, the cell resolution, influences the fidelity of this approximation. A higher resolution DEM will represent smaller surface details, while a lower resolution DEM will only represent larger features. A DEM models the terrain surface at a particular scale that is related to cell resolution (Wood, 1996).

The scale at which a DEM models the terrain is also related to surface roughness. Sampling theory implies that a DEM can only represent features greater than twice as large as the cell spacing – the Nyquist frequency (Theobald, 1989). So the type of topographic feature represented at a certain resolution depends on the scale of the terrain itself. Weibel and DeLotto (1988) and Pike (1988) further discuss the interaction between resolution and the geometric signature of the terrain. Wood (1996) examines the scale dependency of landforms classified from DEMs.
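The sampling limit described above can be illustrated with an idealised terrain profile. The sketch below samples a sinusoidal ridge-and-valley pattern at two cell spacings; treating terrain as a single sinusoid, equating feature size with wavelength, and the function names are all simplifying assumptions made for the example.

```python
import math


def sample_profile(wavelength, spacing, n_samples, amplitude=1.0):
    """Sample a sinusoidal profile of the given wavelength at a fixed spacing."""
    return [amplitude * math.sin(2.0 * math.pi * i * spacing / wavelength)
            for i in range(n_samples)]


def meets_sampling_limit(feature_size, cell_spacing):
    """True if the feature is more than twice as large as the cell spacing."""
    return feature_size > 2.0 * cell_spacing


def relief(samples):
    """Range of elevation captured by the samples."""
    return max(samples) - min(samples)


# Hypothetical ridges repeating every 100 m with 1 m amplitude (2 m relief).
fine = sample_profile(wavelength=100.0, spacing=12.5, n_samples=9)
coarse = sample_profile(wavelength=100.0, spacing=100.0, n_samples=9)

print(meets_sampling_limit(100.0, 12.5), relief(fine))
print(meets_sampling_limit(100.0, 100.0), relief(coarse))
```

At the finer spacing the samples capture the full relief of the ridges. At a spacing equal to the wavelength every sample falls at the same point in the cycle, so the sampled profile is effectively flat and the ridges vanish from the model.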

Environmental processes also operate at particular scales (Band et al., 1995; Garbrecht & Martz, 1999). The resolution of a good quality DEM will be related to the scale of process being studied and the scale at which the relationship between topographic variables (DEM derivatives) and landscape process is found (Florinsky & Kuryakova, 2000; Forman, 1995). Gao (1997) recommends that one use the coarsest possible resolution while meeting a defined quality requirement. However, there are few guidelines in the literature as to how one should determine whether a particular resolution meets a given quality requirement (Garbrecht & Martz, 1999). Florinsky and Kuryakova (2000) propose an experimental statistical method, but this area of study is evidently still in its infancy.

Resolution and its relationship to process and landform scale clearly play a key role in determining DEM quality. However, the research presented here is concerned with DEM quality in general and not with modelling specific landforms or processes. Therefore, the remainder of this thesis concentrates on the first two components of DEM quality: accuracy of elevation values and geomorphometric characteristics.

2.4 Causes of Reduced DEM Quality

In addition to the conceptual limitations of a discretised matrix approximating a continuous surface, there are a number of potential sources of inaccuracy in the DEM generation process. It is well known that the source data for creating DEMs and the generation process are not perfect (Kyriakidis et al., 1999). Literature on the sources of these imperfections is reviewed in three parts: quality of data sources; quality of sample distribution; and, quality of interpolation.



2.4.1 Quality of Data Sources

A key reason for the increased economic viability, and hence wider availability and application, of DEMs over the last 30 years is the technological progress that has been made in the acquisition of terrain data. These advances involve both the development of new data sources, such as higher resolution satellite imagery and LIDAR data (Lillesand & Kiefer, 2000), and the development of new or improved techniques and equipment for extracting elevation data in a digital format, such as electronic tacheometers, analytical stereoplotters and semi-automated cartographic digitising (McLaren & Kennie, 1989).

There is now a wide range of digital elevation data sources for the practitioner to choose from. A suitable choice of source data is related to the size of the area being modelled, the scale of the model, the resources available (time, money and personnel), the type of information to be derived (i.e. the application of the DEM) and the accuracy required (Kennie, 1990). Choice of data source is an important decision, which requires knowledge of the relative advantages, costs and limitations of the various data sources, and also, the techniques for extracting digital elevation data from these. This holds true, whether the digital elevation data are being obtained "in-house", or as "off-the-shelf" products, e.g. from national mapping agencies.

The following section outlines the sources of digital elevation data under the three main headings of ground survey sources, photogrammetric sources and cartographic sources. The quality of LIDAR data sources is not examined here due to the novelty of this data source and the consequent lack of availability for most parts of the world. LIDAR data are of high quality and may be of great use in the future either to create DEMs or, more probably, to check the quality of DEMs (see §3.2.3.2).

2.4.1.1 Ground Survey Sources

Ground survey sources of elevation data involve actual measurement of elevation in the field. Electronic tacheometers, also known as total stations, can be used to create a network of surveyed points covering an area (Kennie, 1990). The equipment allows highly accurate planimetric and altitudinal measurements. The data quality is also enhanced by the use of the surveyor's local knowledge to adapt the survey to incorporate key features and significant sample points, which provide a good representation of the terrain (Stocks & Heywood, 1994; Weibel & Heller, 1991). However, ground surveying is time consuming, restricting its application to small areas. DEMs generated from ground surveys are typically applied to site-specific projects, particularly in civil engineering, or used to supplement photogrammetric data, for example to provide data for wooded areas (Weibel & Heller, 1991). The hazards and inaccessibility of mountain environments preclude the use of ground surveys for DEM generation (Stocks & Heywood, 1994).

GPS equipment has been considered as potentially more practical for ground survey in mountain environments (Heywood et al., 1999). Equipment can be cheaper and more portable than traditional surveying hardware and data collection can be faster. However, a number of limitations currently prevent GPS from being a viable alternative for collecting a complete data set for creating a DEM. These limitations include inaccessibility, the trade-offs between portability and accuracy and between speed and accuracy, the high potential for poor satellite visibility and the difficulties of differential correction in mountainous regions (Carlisle & Jordan, 1999). However, as described in §3.2.3.2, GPS can play a role in assessing DEM accuracy.

2.4.1.2 Photogrammetric Sources

Aerial photography and satellite imagery can be used to derive digital elevation data. At present, the resolution of suitable imagery and the sophistication of processing mean that elevation data derived from satellite imagery are usually only suited to small scale applications covering large areas, such as national, continental or global studies (Fukushima, 1988; Lillesand & Kiefer, 2000). Therefore, satellite imagery data sources are not discussed here.

Stereoscopic and digital image correlation techniques applied to aerial photography provide the most widely available sources of digital elevation data (Stocks & Heywood, 1994). Both the USGS and the Ordnance Survey derive their height data from stereoscopic interpretation of aerial photography (USGS, 1998; Ordnance Survey, 1999).

The accuracy of digital elevation data captured from aerial photographs by stereoscopic techniques is dependent on three groups of factors: the imagery; the equipment; and, the sampling method (Weibel & Heller, 1991; Petrie, 1990a).



The properties of the imagery that influence the accuracy of data capture are: scale and resolution of the aerial photography; the flying height at which the photography was obtained; and, the base to height ratio, or geometry, of the overlapping photographs. These properties are inter-related and it is their combined nature that determines the accuracy of the stereoscopic techniques. Both the minimum vertical interval of the contours that can be derived and the scale of the DEM that can be produced are also dependent on these image characteristics. A compromise must be made between the geometry and scale, or resolution, of the photography. A common choice is a wide angle lens, for example f=15cm, and a base:height ratio of 0.6. This allows a single spot height to be measured with an accuracy of between 1/5000th and 1/15000th of the flying height, dependent on the equipment used. This translates to an RMSE of 0.1m to 3.0m on the ground (Petrie, 1990a).

In terms of the influence of equipment on the accuracy of the actual elevation measurements, the differences between the various makes and models of photogrammetric equipment currently on the market are relatively insignificant. However, the interactive editing and data verification capabilities of analytical stereoplotters produced since the late 1980s give such equipment an important advantage over older equipment. The likelihood of operator error is reduced and the performance of the sampling pattern used can be verified. Additionally, the greater efficiency of such equipment in terms of speed of data capture and editing allows a greater quantity, or denser pattern, of data to be collected (Petrie, 1990a). This difference is important because much of the elevation data provided by national mapping agencies was produced prior to these technical innovations. The Ordnance Survey and USGS data originated in the 1970s or earlier (Ordnance Survey, 1999; USGS, 1998).

The stereo-model is a representation of the continuous topographic surface. Due to time and data volume constraints, digital elevation data can only constitute a sample of elevations over this continuous surface. The choice of sampling method has an important influence on the accuracy of the elevation measurements at these sample locations. Either contour line or point sampling methods can be used. Contour line sampling involves extraction of data as strings of coordinates of equal height. This dynamic measurement of altitude is a rapid data capture method. Consequently the accuracy of any one point on the contour line is lower than for the point sampling techniques discussed below. The accuracy is dependent on the contour interval, which in turn is determined by the flying height. The minimum possible contour interval that can be measured is approximately 1/1000th to 1/2000th of the flying height. At a low flying height of 1000m, contour intervals of 1.0m or less can be defined. In terms of RMSE, contour line sampling can provide measurements accurate to within 0.3 of the contour interval. If contours at a 1m interval are measured, measurements with an RMSE of 0.3m can be obtained. Point sampling involves static measurement techniques at each individual sample point. Therefore, more accurate data can be obtained than for the dynamic, or on-the-fly, contour line sampling method. The points from a contour line sample will have approximately three times the RMSE of single point samples (Petrie, 1990a).
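The rules of thumb quoted from Petrie (1990a) in this section amount to simple proportional arithmetic, sketched below. The function names and the flying heights used are hypothetical; the fractions are those quoted in the text and are approximate guides only.

```python
def spot_height_accuracy_range(flying_height_m):
    """Best and worst expected spot height accuracy (m), taken as
    1/15000th and 1/5000th of the flying height."""
    return flying_height_m / 15000.0, flying_height_m / 5000.0


def contour_interval_range(flying_height_m):
    """Approximate minimum measurable contour interval (m), taken as
    1/2000th to 1/1000th of the flying height."""
    return flying_height_m / 2000.0, flying_height_m / 1000.0


def contour_rmse(contour_interval_m, fraction=0.3):
    """RMSE of contour line sampling, taken as 0.3 of the contour interval."""
    return fraction * contour_interval_m


# Hypothetical low-level survey flown at 1000 m.
print(spot_height_accuracy_range(1000.0))
print(contour_interval_range(1000.0))
print(contour_rmse(1.0))
```

For a flying height of 1000m the contour interval range of 0.5m to 1.0m matches the statement that intervals of 1.0m or less can be defined, and a 1m interval corresponds to an RMSE of 0.3m.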

2.4.1.3 Cartographic Sources

Most DEMs with a grid spacing of 50m or less are produced from cartographic data sources (Weibel & Brändli, 1995). A number of national mapping agencies, such as the UK's Ordnance Survey (Ordnance Survey, 1999) or the United States Geological Survey (USGS, 1998), have digitised versions of their topographic maps and can provide contour lines and spot heights already in digital format. Contour lines and spot heights on most topographic maps have originally been derived using the stereoscopic techniques described above (Carrara et al., 1997; Kyriakidis et al., 1999; Petrie, 1990b). When national mapping agencies initiated digital elevation data projects, cartographic sources were considered safest and most cost effective. Good map coverage was available and contours were easily digitised in a mass production mode. The digital elevation data are predominantly derived from contour lines, but may be supplemented with spot height data (Stocks & Heywood, 1994; Petrie, 1990b). The analogue map data can be converted to digital format by manual digitising or semi-automated line following (Weibel & Heller, 1991; Petrie, 1990b).

Petrie (1990b) gives a thorough description of the accuracy of manual digitising and semi-automated line following. Accuracy is influenced by equipment, map distortion and operator or machine error. The accuracy of the equipment, in terms of the coordinates recorded, will be slightly less than the resolution, or precision, of the equipment. This precision will be in the range of 50μm to 100μm. At a map scale of 1:10 000 this equates to 0.5m to 1m. Over a large format paper-based map sheet distortion, in the form of stretch or shrinkage, of several millimetres can occur. At the 1:10 000 scale this equates to an on-the-ground error of 10m or more. The distortion on more stable plastic map sheets, the originals used by cartographic agencies, is much less than this. Regular scale changes can be removed to reduce the severity of distortions by recording the apparent location of control points on the map and then performing affine transformations on the digitised data. Operator and machine error is of more significance. On-line display of the digitised data and interactive editing reduce the likelihood and extent of such error. However, the potential for undetected errors remains important. Additionally, the extent of these errors is hard to quantify as it is determined by the skill and patience of the individual operator.

Taking into consideration all the above error sources, Petrie (1990b) states that the horizontal accuracy of digitised contours will lie in the range of 0.1mm to 0.25mm RMSE. This equates to 1m to 2.5m on the ground at 1:10 000 scale. When obtaining digital elevation data from cartographic contour lines, a vertical error can only be caused by the operator tagging the wrong elevation value to a contour line. However, although displacement of a contour is horizontal, this will cause an apparent vertical error by the time a continuous DEM surface is generated (Ehlschlaeger & Shortridge, 1997; Miller & Morrice, 1996).
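The conversions in the preceding paragraphs can be checked with a short sketch. The first function converts a distance measured on the map to a ground distance at a given scale. The second gives a first-order estimate of the apparent vertical error produced by a horizontal contour displacement; it assumes a locally planar slope with the displacement perpendicular to the contours, which is a simplification introduced here for illustration and is not taken from the cited sources. Function names and the 20% gradient are hypothetical.

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance (m) for a map distance (mm) at scale 1:scale_denominator."""
    return map_distance_mm / 1000.0 * scale_denominator


def apparent_vertical_error_m(horizontal_error_m, gradient_percent):
    """Apparent vertical error (m) from a horizontal displacement, assuming a
    locally planar slope and displacement directly up or down the slope."""
    return horizontal_error_m * gradient_percent / 100.0


# Digitising RMSE of 0.1 mm to 0.25 mm on a 1:10 000 map.
best = ground_distance_m(0.1, 10000)
worst = ground_distance_m(0.25, 10000)
print(best, worst)

# The same displacements on a hypothetical 20% slope.
print(apparent_vertical_error_m(best, 20.0),
      apparent_vertical_error_m(worst, 20.0))
```

Under these assumptions the apparent vertical error grows with gradient, so a horizontal displacement that is negligible on gentle terrain can become significant on steep slopes.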

It is important to note that the accuracies described above relate to the difference between the digital and analogue contour lines. The original cartographic production process will have introduced further inaccuracies. Eklundh and Mårtensson (1995) note that the purpose of contour lines drawn on cartographic maps is to give a visual representation of the terrain, usually in combination with information about various other phenomena such as land use, transport networks and buildings. Cartographic generalisation is employed to give a clearer visual representation of the features present. This can involve actions such as displacement of contour lines and replacement of close or adjacent contours with rock outcrop symbols in areas of steep terrain. Clearly this will affect the accuracy of a DEM generated from such a source. Contour lines on maps were not originally intended for generation of DEMs and the quantitative techniques of spatial modelling (Weibel & Heller, 1991).

A second important point concerns the method by which analogue contour lines are produced. Fig. 2.2 gives an example by summarising the process of creating the Ordnance Survey's Landform Profile digital contour data product. It can be seen that these data involve a combination of ground survey, photogrammetry and digitising. The accuracy of the digital contour lines will be a function of many of the causes of inaccuracy described above. This illustrates how the user of a DEM should be aware of all stages of the DEM production process and their potential inaccuracies.

Fig. 2.2 Production process for the Ordnance Survey's Landform Profile digital contour data. Source: Ordnance Survey (1999).

[Flow diagram. Stages shown, in order: aerial photography flown since the 1960s and ground survey of significant points; photogrammetric analysis; contour lines and spot heights; cartographic generalisation; topographic information on the 1:10,000 map series; semi-automated line following; Landform Profile digital contour data.]



2.4.2 Quality of Sample Distribution

The above discussion of data sources has described the accuracy of measurements at single points or along contour lines. However, these accuracies only relate to the sample points. The accuracy of the DEM generated from this sample data will be heavily dependent on how representative of the whole topographic surface these sample points are. This is impossible to quantify, but relates to the sampling pattern used. Makarovic (1979, p.146) defines sampling as “the transition from the real terrain surface to a structured set of numerical data”. It is only practical to describe the surface continuum with a finite number of elevation samples. This discretisation of the surface continuum results in loss of information whatever sampling technique is employed (Blais et al., 1986). The level or significance of this information loss depends on the sampling pattern, i.e. the spatial arrangement of sample points, and the sampling density in relation to the characteristics of the terrain surface being sampled and the intended application of the resultant surface model (Stocks & Heywood, 1994; Weibel & Heller, 1991; Blais et al., 1986). The quality of a sample distribution can be defined according to the efficiency of the elevation data collected, i.e. the relationship between the level of detail provided in the subsequent DEM and the amount of effort expended collecting the data (Köstli & Wild, 1984).

The following sections review the merits and shortcomings of common sample patterns. Working in tandem with the sample pattern is the influence of sample density on the efficiency of the resultant elevation model. Sample density issues are inextricably linked to sample pattern issues and are thus raised during the investigation of sample pattern.

2.4.2.1 Contour Line Sampling

In addition to the elevation accuracy limitations discussed earlier, there are also limitations to the use of contour lines in terms of sample distribution, particularly when they have been digitised from topographic maps. The fundamental cause of these restrictions is again that the primary purpose of contour lines is terrain visualisation rather than digital modelling (Eklundh & Mårtensson, 1995; Weibel & Heller, 1991). First, the sample distribution is consequently often far from optimal, giving an excessive sample density along the length of contour lines, but no data between lines (Weibel & Heller, 1991). Contour lines represent a statistically biased sample distribution. However, this bias is reduced on steep terrain where contour lines lie closer together (Eklundh & Mårtensson, 1995). Second, although contour data implicitly sample significant terrain features such as slope discontinuities, ridge and stream lines, represented as line sections of locally maximum curvature (Hutchinson, 1989), interpolation algorithms require such information to be expressed explicitly, if the structural detail is to be passed on to the resultant DEM (Eklundh & Mårtensson, 1995).

2.4.2.2 Regular Grid Sampling and Profiles

Regular grids and regular profiles locate samples at fixed intervals (Weibel & Heller, 1991). They are the simplest and most widely used forms of sampling pattern (Petrie, 1990a). A regular grid is the usual pattern of data derived from satellite imagery and aerial photography (Stocks & Heywood, 1994; Weibel & Heller, 1991). Profiles may be created from the contour lines of topographic maps, locating sample points at the intersection of the specified profile lines and the map's contour lines (Carter, 1988). Profile sampling may also be employed with stereoscopic capture techniques. In this case, profile lines are specified and samples are collected at critical points along the profiles, i.e. where there are significant changes in gradient, with supplementary points collected at intermediate locations so as to provide a good scatter of samples (Carter, 1988).

The principal advantages of these two sample patterns are the simplicity, the implicit topological relationships between sample points and, once the grid or profile interval has been determined, the objective and automatable selection of sample points (Makarovic, 1979). This means that data collection is fast, at least in terms of the time taken to capture each point. Additionally, use of a fixed interval gives a statistically unbiased distribution of sample points (Eklundh & Mårtensson, 1995).

As would be expected of a simple and automatable technique, these sample patterns suffer from a lack of sophistication. A first disadvantage of regular sampling is the lack of local adaptation of the sample density relative to the terrain (Stocks & Heywood, 1994; Weibel & Heller, 1991; Makarovic, 1979). The primary goal of elevation data collection is to produce a sample from which a digital model can be derived that represents the smallest terrain feature of interest (Stocks & Heywood, 1994; Petrie, 1990a; Balce, 1987). Often it is impractical to employ a sufficiently fine sampling interval. In that case, because the sample pattern does not adapt locally to the terrain, a high sampling effort generates redundant data in smooth terrain, yet still produces generalised or smoothed surfaces in rugged terrain (Stocks & Heywood, 1994; Makarovic, 1979). For these reasons, Weibel & Heller (1991) recommend regular sampling for homogeneous, low relief terrain. However, Stocks & Heywood (1994) suggest that in mountain environments there is often a consistently high degree of terrain heterogeneity and, therefore, the problem of data redundancy is not as acute as elsewhere.

A number of techniques have been developed to determine the optimum sampling interval for regular grids and profiles. Examples of these techniques are based on the use of spectral analysis (Fritsch, 1984, cited Blais et al., 1986), self-similarity concepts (Frederiksen et al., 1983 & 1984, cited Blais et al., 1986), linear interpolation criteria (Balce, 1987; Frederiksen et al., 1984, cited Blais et al., 1986) and regionalised variable theory (Blais et al., 1986). Balce (1987) and Blais et al. (1986) evaluate the performance of each of these techniques on various types of terrain. In both cases the conclusion is that no technique performs consistently better than the others and there is no evidence that use of any of the techniques gives a reliable estimate of the optimum sampling interval. Balce (1987) found that these techniques were most unreliable in rougher terrain, as is characteristic of mountain environments. Given the mathematical and statistical complexity and unclear benefit of the above techniques, the intuitive determination of sampling interval is often more appropriate and adequately effective (Eklundh & Mårtensson, 1995).

2.4.2.3 Progressive Sampling To resolve the limitations of regular sampling due to the lack of local adaptation to the terrain heterogeneity, Makarovic (1973) developed the progressive sampling technique. An initial coarse grid of elevations is sampled, then the terrain roughness of 3 x 3 patches is analysed. If the terrain roughness of a patch exceeds a user-defined threshold, that patch is resampled at double the original density, i.e. half the original grid interval. This iteration is repeated either a fixed number of times (Ebner & Reinhardt, 1984), or until no patch exceeds the terrain roughness threshold, which is commonly achieved within three or four iterations (Petrie, 1990a).
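
The densification loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not Makarovic's (1973) implementation: the terrain is represented by a hypothetical function `height(x, y)`, patch roughness is taken to be the largest absolute second difference of the 3 x 3 elevations (an assumed measure), and the names `roughness` and `progressive_sample` are illustrative only.

```python
def roughness(z):
    """Largest absolute second difference along the rows and columns
    of a 3 x 3 patch of elevations (an assumed roughness measure)."""
    rows = [abs(r[0] - 2 * r[1] + r[2]) for r in z]
    cols = [abs(z[0][j] - 2 * z[1][j] + z[2][j]) for j in range(3)]
    return max(rows + cols)


def progressive_sample(height, x0, y0, step, threshold, max_iter, samples=None):
    """Sample the 3 x 3 patch whose lower-left node is (x0, y0) with node
    spacing `step`, then resample it at half the interval wherever its
    roughness exceeds `threshold`, for at most `max_iter` iterations."""
    if samples is None:
        samples = {}
    z = [[height(x0 + i * step, y0 + j * step) for i in range(3)]
         for j in range(3)]
    for j in range(3):
        for i in range(3):
            samples[(x0 + i * step, y0 + j * step)] = z[j][i]
    if max_iter > 0 and roughness(z) > threshold:
        half = step / 2
        # The densified 5 x 5 nodes form four 3 x 3 sub-patches.
        for dy in (0, step):
            for dx in (0, step):
                progressive_sample(height, x0 + dx, y0 + dy, half,
                                   threshold, max_iter - 1, samples)
    return samples
```

A caller would apply `progressive_sample` to each 3 x 3 patch of the initial coarse grid; on a planar surface no patch is densified, so only the nine coarse nodes are returned.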


This technique restricts higher density sampling to areas of more varied relief and therefore reduces data redundancy, while maintaining the surface representation quality (Stocks & Heywood, 1994; Weibel & Heller, 1991; Ebner & Reinhardt, 1984). Makarovic (1979) claims the technique inherently steers sampling towards collection of a comprehensive data set, in which all significant terrain features are adequately sampled. However, loss of some significant terrain features due to under sampling is possible (Stocks & Heywood, 1994; Makarovic, 1979) and the use of regular grid densification to adjust sample density does not avoid data redundancy in sampling the most abrupt changes in terrain (Weibel & Heller, 1991; Balce, 1987; Makarovic, 1979).

Application of progressive sampling is in practice limited to photogrammetric techniques, but in such situations can be automated (Weibel & Heller, 1991).

2.4.2.4 Selective Sampling If regular and progressive sampling techniques can be described as objective, although the user must specify an initial sampling interval, then selective sampling is subjective. It relies on the proficiency of the data collector, either a surveyor in the field or an operator of photogrammetric equipment, to select the points and lines which are topographically most significant in terms of terrain surface representation (Weibel & Heller, 1991). Significant point features include peaks, pits and saddles, while significant line features include ridges, stream channels, cliffs, faults and other abrupt changes in gradient.

The principal benefit of this technique is the opportunity for the data collector to include field knowledge or aerial photograph interpretation expertise in the data capture process. This allows a more robust DEM to be generated in which all key terrain features are effectively represented (Stocks & Heywood, 1994; Weibel & Heller, 1991). Indeed, it is only through selective sampling of these significant surface features that an efficient and high quality DEM can be generated (Ebner & Reinhardt, 1984). Selective sampling is the most efficient technique in terms of data volume for accurately representing surface discontinuities (Weibel & Heller, 1991).

The use of selective sampling has drawbacks. First, the subjectivity of the method equates to a reliance on the data capturer’s ability, which is difficult to quantify and is often unknown (Makarovic, 1979). Second, the high level of human intervention prevents the automation of sample selection, thus slowing down the data capture process (Weibel & Heller, 1991; Makarovic, 1979). Third, in areas lacking in significant surface features the resultant data set will be sparse, placing an increased onus on interpolation to model the continuous surface (Lam, 1983). This last drawback is less serious in rugged mountain terrain than other areas. Nonetheless, used in isolation, selective sampling is likely to be prohibitively time consuming if a sufficient density of points and lines is to be captured for an adequate surface representation to be generated.

2.4.2.5 Composite Sampling The term ‘composite sampling’ was originally coined by Makarovic (1979) to describe the combination of two or more of the above sampling patterns. The above sections demonstrate that each sampling pattern has its merits. A single technique will rarely fulfil the two criteria of high quality surface representation and efficiency in terms of data volume and data capture effort. The potential advantages of using a combination of sampling patterns are therefore apparent. Balce (1987) states that optimum sampling must involve use of regular, progressive and selective techniques. Makarovic (1979) asserts that optimum sampling should inherit the merits and minimise the shortcomings of both the regular grid and selective sampling. He recommends a combination of progressive and selective sampling techniques. In practice composite sampling is restricted to a combination of selective sampling and one of the other techniques, commonly contour or regular grid sampling (Eklundh & Mårtensson, 1995; Tuladhar & Makarovic, 1988).

2.4.2.6 Conversion of Sample Distribution As noted by Makarovic (1979), the sampling of data and the arrangement into a convenient format may be a two-stage process. After the initial collection of elevation data it is possible to rearrange the distribution of data points through interpolation (see §2.4.2.7). This is often desirable as one distribution may be best for data storage and/or distribution, while another is more appropriate for data manipulation (Peucker, 1980). For example, the Ordnance Survey’s Landform Panorama 50m regular grid data, the Swedish Land Survey’s 50m regular grid data, and the USGS 30m regular grid data are all converted from aerial photography derived contour lines (Ordnance Survey, 1992; Ekstrand, 1993; Light, 1993). Whatever the nature of the conversion, each transformation results in loss of information and consequently a more generalised surface (Carter, 1988).

2.4.3 Quality of Interpolation There is a wide range of interpolation techniques, which may be used in any application areas where a phenomenon of interest is continuously distributed over space and can therefore only be measured at sample locations on this surface. For a comprehensive review of all interpolation techniques see Lam (1983), Schut (1976), Schumaker (1976) or Rhind (1975). This section describes interpolation techniques that are suitable for generating DEMs from elevation data and examines their merits in terms of DEM quality.

The underlying principle of the interpolation process is that the surfaces to be estimated can be assumed to possess certain characteristics or behavioural properties, which can be mathematically replicated by an interpolator’s algorithm (Lam, 1983). This principle does not reflect reality, where every terrain surface is unique and few if any terrain surfaces demonstrate definable behavioural properties. This is particularly so in the often rugged and heterogeneous landscapes of mountain environments. Nonetheless, the underlying surface behaviour assumptions of certain interpolators give reasonable approximations of particular types of terrain surface, although no single interpolation technique will give superior results in all cases. The success of any interpolator is not only a reflection of the suitability of the underlying surface behaviour assumptions, but is also heavily dependent on the density and distribution of sample data.

2.4.3.1 Contour Line Interpolation Interpolation from contour data is an interesting case, in that there is a very high density of sampling along the contour lines. This provides finely detailed terrain information (Hutchinson, 1988). However, in contrast, there is a complete lack of data between the contour lines. This can lead to problems such as flat-topped peaks and abrupt changes in elevation at saddle points and irregularly shaped widely spaced contours (Clark Labs, 2001).


There are comparatively few techniques that work specifically on isoline data, i.e. contour lines in DEM generation applications. Additionally, all of these techniques work on the same principle. It is possible to dismantle the data into a series of points representing the contour line vertices and then use any of the techniques for point data described below. However, it is intuitively preferable to retain the full fineness of detail provided by the contour lines and use a contour specific algorithm (Hutchinson, 1988).

The principle behind contour interpolation algorithms is that interpolating along straight-line profiles running between adjacent contour lines gives the best estimate of elevation between contours (Weibel & Heller, 1991). Some algorithms orientate these profiles in pre-determined directions, commonly towards the cardinal points of the compass. More sophisticated algorithms use the direction of steepest gradient. This should extend the fine structure of the contour lines to the intermediate areas (Hutchinson, 1988). The interpolator along the profiles can either be linear or cubic (Weibel & Heller, 1991).

In addition to interpolating elevation values, Yoeli (1986) and Wood (1994) generate local measures of interpolation uncertainty as part of the contour interpolation process. This is done by interpolating height values along all four profiles. The variability of the four interpolated values, expressed as RMSE, gives a measure of uncertainty associated with using that interpolation technique.
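
As an illustration of the profile principle and of the local uncertainty measure, the sketch below interpolates linearly along each of four profiles and summarises the spread of the four estimates. It rests on assumptions that are not taken from Yoeli (1986) or Wood (1994): each profile is supplied as a hypothetical tuple `(z_a, d_a, z_b, d_b)` giving the elevations of, and distances to, the two contours it meets, and the spread is computed as the root mean square deviation of the four estimates about their mean.

```python
import math


def profile_estimate(z_a, d_a, z_b, d_b):
    """Linear interpolation along one profile between two contours with
    elevations z_a and z_b, met at distances d_a and d_b from the cell."""
    return (z_a * d_b + z_b * d_a) / (d_a + d_b)


def four_profile_estimate(profiles):
    """Return (mean estimate, RMS deviation of the estimates about the mean)
    for a list of profiles, each given as (z_a, d_a, z_b, d_b)."""
    estimates = [profile_estimate(*p) for p in profiles]
    mean = sum(estimates) / len(estimates)
    spread = math.sqrt(sum((e - mean) ** 2 for e in estimates) / len(estimates))
    return mean, spread
```

Where the four profiles agree, the spread is zero; a large spread flags a cell whose elevation depends strongly on the profile direction chosen.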

Contour algorithms are simple both in principle and computationally, but there are limitations. The supposedly more sophisticated algorithms, which interpolate along the line of steepest gradient, often restrict their search to the cardinal directions anyway (Hutchinson, 1988). The techniques make no use of the most important structural features implicit in the contours, namely points of locally maximum curvature, which indicate the location of ridge and stream lines, to improve the search for lines of steepest gradient (Hutchinson, 1988). Hutchinson’s work (1988, 1989; §2.4.3.6) is an exception.

2.4.3.2 Point Interpolation Local, also described as piecewise, interpolation methods provide a more realistic representation of the heterogeneity of a terrain surface by separating the surface into patches and customising the interpolation to each patch using point data. Local interpolation is based on Tobler’s First Law of Geography, which states that all things are related, but near things are more related than those further apart (Longley et al., 2001). The elevation of a location can be estimated by considering the elevations of nearby locations. Patch selection involves defining which nearby points should be used to estimate the unknown elevation. There are two commonly used approaches to patch selection:

1. select all points within a fixed radius of the cell to be estimated (Fig. 2.3a); or,
2. select the n closest points (usually between 4 and 12 points, or more; 6 is common) (Fig. 2.3b).

Both of these patch selection approaches have shortcomings. The fixed search distance procedure can cause no sample points to be selected in flat terrain (Fig. 2.3c), in which case the second selection procedure may be temporarily invoked, or, conversely, on steep terrain too many points may be included, which can cause loss of surface detail (Lam, 1983).

The fixed number of points procedure is sensitive to clustering of sample points, which would then have too great an influence on the interpolated value (Fig. 2.3d; Lam, 1983).
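
The two patch selection approaches, and the fallback from the first to the second when no points are found, might be sketched as below. This is an illustrative brute-force search, assuming sample points are held as `(x, y, z)` tuples; the function names are hypothetical.

```python
import math


def within_radius(points, cell, radius):
    """Sample points (x, y, z) lying within `radius` of the cell (x, y)."""
    return [p for p in points if math.dist(p[:2], cell) <= radius]


def n_closest(points, cell, n=6):
    """The n sample points nearest to the cell."""
    return sorted(points, key=lambda p: math.dist(p[:2], cell))[:n]


def select_patch(points, cell, radius, n=6):
    """Fixed-radius search that temporarily falls back to the n closest
    points when the radius captures no samples (as in Fig. 2.3c)."""
    patch = within_radius(points, cell, radius)
    return patch if patch else n_closest(points, cell, n)
```

Neither function guards against the clustering problem of Fig. 2.3d: `n_closest` will return six points from a single contour if those happen to be the nearest.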


Fig. 2.3 Patch selection procedures: a) using a fixed radius; b) using the 6 closest points; c) no points within fixed radius; d) 6 closest points all lying on the same contour.

The above problems frequently occur when interpolating from the vertices of contour lines, because of the highly uneven distribution of sample points described in §2.4.2.1 (Burrough & McDonnell, 1998). The problems can be mitigated by dividing the area around the cell to be interpolated into quadrants or octants and selecting the nearest one or two points from each sector (Fig. 2.4). While improving the patch selection in a sample point clustering situation, this modification can introduce other problems, such as sample point “shadowing” (Gold, 1989) or empty sectors near data area boundaries (Clarke, 1990).
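
A quadrant search of the kind described above can be sketched as follows. The sector boundaries are assumed here to run north-south and east-west through the cell, and `quadrant_search` is an illustrative name rather than a function from any particular GIS package.

```python
import math


def quadrant_search(points, cell, per_sector=2):
    """Select up to `per_sector` nearest sample points (x, y, z) from each
    of the four quadrants around `cell`."""
    cx, cy = cell
    sectors = {0: [], 1: [], 2: [], 3: []}
    for x, y, z in points:
        # Quadrant index from the offset signs: +1 if east, +2 if north.
        index = (1 if x >= cx else 0) + (2 if y >= cy else 0)
        sectors[index].append((x, y, z))
    selected = []
    for members in sectors.values():
        members.sort(key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
        selected.extend(members[:per_sector])
    return selected
```

A sector with no sample points simply contributes nothing, which is the empty sector situation that arises near data area boundaries.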


Fig. 2.4 Dividing the search area into quadrants and selecting two points from each quadrant.

A final shortcoming is that, with the exception of Kriging, local interpolators rely on the user having the experience and knowledge to provide suitable patch selection criteria, either as a fixed number of points or as a fixed distance search radius (Gold, 1989).

Despite these limitations, local interpolation techniques are the most widely available and most widely used. Computation is relatively easy and the techniques give reasonable results at medium to large scales (Stocks & Heywood, 1994). There is a wide variety of local interpolators. The following sections describe three commonly used techniques: inverse distance weighting, splines and Kriging.

2.4.3.3 Inverse Distance Weighting Having selected neighbouring sample points as described above, inverse distance weighting techniques assign a weight to each sample point’s elevation value based on the distance of the point to the cell being interpolated. Closer sample points are assumed to have a greater influence on the interpolated value, i.e. the weighting is proportional to an inverse power of the distance (Burrough & McDonnell, 1998). Equation 2.4 shows a generalised notation of the algorithm.


Equation 2.4: Generic Inverse Distance Weighting Algorithm

$$ z_{i,j} = \frac{\sum_{p=1}^{N} z_p \, d_p^{-w}}{\sum_{p=1}^{N} d_p^{-w}} $$

where:
$z_p$ = the elevation of sample point p
$d_p$ = the distance from the grid cell to sample point p
$w$ = the weighting (the power applied to the inverse distance)
$N$ = the number of sample points in the grid cell’s neighbourhood
$z_{i,j}$ = the grid cell’s estimated elevation
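
A direct transcription of Equation 2.4 is given below as a minimal sketch. It assumes that the neighbouring sample points have already been selected and are held as `(x, y, z)` tuples, and that the weighting `w` is supplied as a positive power (so `w=2` corresponds to weighting by the inverse square of distance); the function name `idw` is illustrative.

```python
import math


def idw(samples, cell, w=2.0):
    """Inverse distance weighted elevation at `cell` (x, y) from the
    selected samples (x, y, z), following Equation 2.4."""
    numerator = 0.0
    denominator = 0.0
    for x, y, z in samples:
        d = math.hypot(x - cell[0], y - cell[1])
        if d == 0.0:
            return z  # the cell coincides with a sample point
        weight = d ** -w
        numerator += z * weight
        denominator += weight
    return numerator / denominator
```

The early return avoids a division by zero where a grid cell falls exactly on a sample point, in which case the sample elevation is used unchanged.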

There are other characteristics particular to inverse distance weighting, in addition to those general limitations of local techniques described previously, which influence the usability and outcomes of this type of technique.

First, the user must specify the weighting to be used. This choice is somewhat ambiguous when the underlying characteristics of the surface being interpolated are not known (Lam, 1983). A weighting of $d^{-2}$ is commonly used, and anywhere between $d^{-1}$ and $d^{-6}$ may give optimal results, depending on the nature of the terrain. The higher the weighting given to the closest points, i.e. towards $d^{-6}$, the greater the incidence of breaks of slope and local minima and maxima, and the less smooth the resultant surface (Gold, 1989).

Second, inverse distance weighting is essentially a smoothing procedure (Lam, 1983). Interpolated values can never exceed the minima and maxima of the sample point elevations. Increasing the weighting towards $d^{-6}$ can reduce, but never overcome, the effects of this characteristic. Peaks, pits, ridges and streamlines, important features of the landscape, are poorly interpolated.
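
The bound on interpolated values follows directly from the form of Equation 2.4. As a short worked step, using the symbols of that equation with the weighting taken as a positive power $w$, and introducing $\lambda_p$ here purely as shorthand for the normalised weights:

```latex
\lambda_p = \frac{d_p^{-w}}{\sum_{q=1}^{N} d_q^{-w}},
\qquad \lambda_p > 0,
\qquad \sum_{p=1}^{N} \lambda_p = 1
\quad\Longrightarrow\quad
\min_{p} z_p \;\le\; z_{i,j} = \sum_{p=1}^{N} \lambda_p z_p \;\le\; \max_{p} z_p
```

Because the estimate is a weighted average with positive weights that sum to one, it cannot lie outside the range of the selected sample elevations, whatever value of $w$ is chosen.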

Third, inverse distance weighting can be heavily affected by non-uniform distributions of sample points. This applies not only to clustering, as described above, but also to linear grouping of sample points, as would occur if the vertices of contour lines were used as the data source (Mitasova et al., 1996; Eklundh & Mårtensson, 1995; Fairfield & Leymarie, 1991). Such a distribution of sample points can result in a heavily terraced DEM, with a majority of interpolated values having the same elevation as the source contours. This produces a DEM of limited reliability in applications such as soil erosion path modelling or definition of drainage networks. The terracing problem can be alleviated by applying a smoothing filter to the interpolated DEM. However, although this may increase the “realism” of the surface representation, such action may easily result in an overall degradation of the DEM by smoothing away accurate terrain details (Brown & Bara, 1994; Mark, 1984). Another approach would be to reduce the spatial bias of the sample by thinning out the data points along the contour lines prior to conversion to point entities (Eklundh & Mårtensson, 1995). A line generalisation procedure such as the Douglas-Peucker algorithm is suitable. The influence of these two actions on decreasing elevation accuracies, degradation of terrain details and reduction of the terracing effect has not, however, been researched in any detail.
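
For reference, the Douglas-Peucker line generalisation mentioned above can be sketched in a few lines. This is a plain recursive version, assuming a contour is held as a list of `(x, y)` vertices and that `tolerance` is in the same units as the coordinates; it is an illustration only and has not been tuned for the very long vertex lists typical of digitised contours.

```python
import math


def perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b
    (or to a itself when a and b coincide, as on a closed contour)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length


def douglas_peucker(line, tolerance):
    """Return a thinned copy of `line`, keeping only vertices that deviate
    from the simplified line by more than `tolerance`."""
    if len(line) < 3:
        return list(line)
    distances = [perpendicular_distance(p, line[0], line[-1])
                 for p in line[1:-1]]
    worst = max(range(len(distances)), key=distances.__getitem__)
    if distances[worst] <= tolerance:
        return [line[0], line[-1]]
    split = worst + 1  # index in `line` of the farthest vertex
    left = douglas_peucker(line[:split + 1], tolerance)
    right = douglas_peucker(line[split:], tolerance)
    return left[:-1] + right
```

A larger tolerance removes more vertices and so reduces the along-contour sample density, at the cost of discarding some of the fine structure of the contour lines.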

The advantages of inverse distance weighting are: the simplicity of the underlying principle; computational efficiency and, hence, high processing speed; the flexibility of the algorithm for customisation to different types of terrain surface, by adjusting the nearest neighbour search procedure and the weighting; the reasonable results obtained in many situations and applications, including DEM generation; and, hence, the wide availability of inverse distance weighting functions in GIS packages (Burrough & McDonnell, 1998; Gold, 1989; Lam, 1983).

2.4.3.4 Splines Spline techniques select certain nearest neighbour data points, subject to the limitations described above, and then fit a mathematical surface patch, described by a bicubic spline function, through these points (Burrough & McDonnell, 1998). Using spline functions ensures that the join between one patch and the next is continuous. Also, the local curvature of a patch can be constrained to a minimum (Gold, 1989). Spline interpolation is a fast method (Mitasova et al., 1995) and creates very smooth DEM surfaces (Burrough & McDonnell, 1998).

Spline methods allow interpolated values to lie outside the range of input sample point heights, so the problems of flat-topped peaks and flat-bottomed valleys that arise with inverse distance weighting and contour line techniques do not occur. Splines offer other potential improvements over inverse distance weighting, including better retention of small scale features and hence a more accurate interpolated surface (Burrough & McDonnell, 1998), although this benefit is greatest on comparatively smooth terrain (Eklundh & Mårtensson, 1995). Mitasova et al. (1996) and Eklundh & Mårtensson (1995) find that spline methods tend to produce more morphologically sensible surfaces, being less prone to terracing and spurious pit problems.
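
The ability of a cubic interpolator to exceed the range of its input heights can be shown on a single profile. The sketch below uses a one-dimensional Catmull-Rom spline as a simple stand-in; it is not the bicubic patch method described above, and the four heights are invented purely for illustration.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Cubic Catmull-Rom value between p1 (t = 0) and p2 (t = 1),
    for four equally spaced heights p0..p3."""
    return 0.5 * (2 * p1
                  + (p2 - p0) * t
                  + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
                  + (3 * p1 - p0 - 3 * p2 + p3) * t ** 3)


def linear(p1, p2, t):
    """Linear interpolation between p1 (t = 0) and p2 (t = 1)."""
    return p1 + (p2 - p1) * t


# Hypothetical heights sampled across a ridge crest.
ridge = (90.0, 100.0, 100.0, 90.0)
crest_cubic = catmull_rom(*ridge, 0.5)        # rises above both samples
crest_linear = linear(ridge[1], ridge[2], 0.5)  # stays flat at 100
```

Midway between the two 100 m samples the cubic curve rises slightly above both, whereas linear interpolation leaves a flat top at 100 m.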

2.4.3.5 Kriging Kriging techniques were developed to resolve ambiguities of specifying weights and search parameters for inverse distance weighting and other local methods. The underlying principle is regionalised variable theory, which breaks down the spatial variation of elevation (or other attributes) into three components: underlying drift, spatially correlated random variation and random noise. Basing interpolation on statistical theory allows statistically optimal interpolation, involving the most reliable selection of interpolation parameters, and a measure of each interpolated point’s estimation variance, which the algorithm minimises (Clarke, 1990). This latter feature, which gives an indication of the interpolation accuracy, is useful for determining the likelihood that a surface feature in the model is reliable and for indicating where additional data sampling in the future would be most beneficial (Lam, 1983).

The technique is undoubtedly statistically superior and in many situations will provide a numerically more accurate surface than other techniques. However, Kriging is not best suited to interpolating DEMs for a number of reasons. First, Kriging works best on data with well-defined local trends, allowing easy estimation of the semi-variogram form (Clarke, 1990). Terrain surfaces generally, but mountain terrain in particular, are typically heterogeneous and, therefore, may often be poorly suited to the regionalised variable model (Lam, 1983). Second, the improvement in reliability and surface accuracy must be weighed against increased computational costs and the need for a user with a level of statistical competence (Clarke, 1990). In fact, Lam (1983) and Kubic & Botman (1976) assert that, except where the sparsest sample data are available, Kriging techniques offer little improvement over other techniques. Third, Kriging is as susceptible to generation of terracing and spurious pits as inverse distance weighting (Eklundh & Mårtensson, 1995). Fourth, the principle of regionalised variable theory and the associated Kriging methodology is highly unintuitive (Eklundh & Mårtensson, 1995).
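
The semi-variogram referred to above is normally estimated from the sample data before any Kriging is carried out. The sketch below computes a simple empirical semi-variogram, taking the semivariance in each lag bin as half the mean squared elevation difference of the point pairs in that bin. It assumes isotropy, equal-width lag bins and samples held as `(x, y, z)` tuples; the function name and parameters are illustrative.

```python
import math


def empirical_semivariogram(samples, lag_width, n_lags):
    """Return (mean pair distance, semivariance, pair count) for each
    non-empty lag bin, from samples given as (x, y, z)."""
    sq_diff = [0.0] * n_lags
    dist = [0.0] * n_lags
    count = [0] * n_lags
    for i in range(len(samples)):
        xi, yi, zi = samples[i]
        for xj, yj, zj in samples[i + 1:]:
            h = math.hypot(xi - xj, yi - yj)
            k = int(h // lag_width)  # index of the lag bin for this pair
            if k < n_lags:
                sq_diff[k] += (zi - zj) ** 2
                dist[k] += h
                count[k] += 1
    return [(dist[k] / count[k], sq_diff[k] / (2 * count[k]), count[k])
            for k in range(n_lags) if count[k]]
```

On heterogeneous terrain the resulting points may not follow any simple model form, which is one way of seeing the first difficulty noted above.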


2.4.3.6 Terrain Specific Interpolators The interpolation techniques described so far can all be applied to any phenomenon that is continuously distributed over space. There are a number of interpolation techniques that have been developed specifically for interpolating elevation data to create a DEM. These techniques aim to avoid specific DEM problems, such as spurious pits, or to maximise the efficacy with which certain data sets, such as contour lines, are used. Hutchinson (1988, 1989) has developed algorithms that automatically avoid creating spurious pits and automatically determine ridge and channel lines from areas of maximum curvature along contour lines. This creates DEMs that are ready for use in hydrological studies and which retain much of the detail provided by contour lines’ fine structure. Hutchinson’s ANUDEM interpolation procedure is available as part of ArcInfo GIS functionality. However, such specialist techniques are otherwise not currently provided by commonly used commercial GIS packages.

2.4.3.7 Selecting an Interpolation Method As stated previously, no interpolation method is superior in all situations. In addition to the influence of sample data distribution, density and accuracy, the selection of interpolation method has an important influence on the resultant DEM. Theoretically, a good technique can be qualified as one which uses a minimum number of sample points to generate a DEM that represents all the main elements of the landscape, such as ridges, valleys and peaks (Stocks & Heywood, 1994). In practice a more quantitative approach may be preferable, such as determining the elevation difference between the DEM and a more reliable representation of the surface or the surface itself. As outlined above, the particular characteristics of each type of interpolation technique determine that technique’s benefits and disadvantages in terms of usability and quality of surface representation. The points below summarise the varying characteristics of the different techniques into a series of selection criteria that a potential user of interpolation might address:

1. What form of sample data is available (contours or points)?
2. How dense/sparse are the sample data and how even is the spatial distribution pattern?
3. How competent is the user? Is the user familiar with use of a particular method?
4. Will the interpolation technique be used in varying types of terrain? If so, can the technique be adapted to maximise performance in each of these terrain types?
5. Should the resultant surfaces be comparatively rough or smooth?
6. Is a numerically accurate surface required? How accurate?
7. Is a morphologically sensible surface required?
(Lam, 1983; Weibel & Heller, 1991; Peucker, 1980)
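
The quantitative approach mentioned before the list, and the question of numerical accuracy in criterion 6, both come down to comparing DEM elevations with more reliable reference elevations at the same locations. A minimal sketch of that comparison follows, assuming the two sets of values are supplied as equal-length lists in matching order; the function names are illustrative.

```python
import math


def elevation_errors(dem, reference):
    """Signed elevation differences (DEM minus reference) at check points."""
    return [d - r for d, r in zip(dem, reference)]


def mean_error(dem, reference):
    """Mean signed error, which indicates any systematic bias."""
    errors = elevation_errors(dem, reference)
    return sum(errors) / len(errors)


def rmse(dem, reference):
    """Root mean squared error of the DEM against the reference values."""
    errors = elevation_errors(dem, reference)
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

Reporting the mean error alongside the RMSE distinguishes a DEM that is consistently too high or too low from one whose errors are large but unbiased.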

2.4.4 Quality Considerations when Creating DEMs The user of digital elevation data must be aware of the entire data creation process if an accurate assessment of the quality of source data is to be made. There are two possible scenarios with respect to the influence of source data on DEM quality. First, the DEM user obtains digital elevation data from a third party, such as a national mapping agency or other data provider. In this situation, there will frequently be no opportunity to select between different data sources and sampling techniques. Quality considerations are then limited to ensuring that the user gains sufficient knowledge of the techniques used to collect the data and is aware of the relative merits of the particular sampling technique used. Second, the user undertakes or is involved in the data collection process. Given this scenario, contour line sampling, from topographic maps or stereoscopic aerial photograph techniques, or regular grid sampling from stereoscopic aerial photograph techniques are likely to be most practical. The wide availability of these data sources and the simplicity and adequate efficiency of the sampling techniques are advantageous. However, the added benefit of supplementing such sampling with selective sampling is significant and should be implemented whenever possible.

The different interpolation techniques available produce significant variations in the output DEMs. When interpolating a DEM from the digital elevation data, it is worth experimenting with different interpolation methods and varying search radii, weights and other parameters and checking the effect on the quality of the output DEM. Different techniques for assessing this quality are considered in Chapter 3.


Chapter 3: Assessing DEM Quality This chapter presents research undertaken to comprehensively examine the quality of DEMs. First, the rationale, aims and objectives of the research are stated. Second, background information is given by means of reviewing relevant literature. Then the methodology and results are given before finally discussing the results and drawing conclusions.

3.1 Rationale, Aim and Objectives As introduced in Chapters 1 and 2, uncertainty in spatial data, particularly DEMs, is an important, but under-researched issue. There is a need for further, more systematic research into the nature of DEM quality.

Previous research has tended to focus on one aspect of DEM quality such as quality of interpolation method, comparison of different sampling patterns, or quality of source data. A more holistic approach is required, which considers all these factors, and evaluates the quality of the final DEM with respect to the original terrain surface.

A second limitation of previous research is the inconsistent and incomplete way that DEM quality has been assessed and described. Authors have adopted one, or maybe two, of three approaches to quality assessment: visual assessment, geomorphometric characterisation, and estimating accuracy. The most commonly employed approach is assessment of accuracy. It is assumed that measuring elevation error is a useful way of assessing DEM quality. The clear statistical evidence provided by error quantification can be more attractive and useful in computing terms than, for example, the general quality description provided by DEM visualisation. However, the purpose of a DEM is rarely to simply provide elevation values across the area concerned. Rather the DEM is the basis for higher order analysis such as modelling of processes and spatial relationships acting over or present on the terrain surface. For this reason, the quantification of elevation error may be of limited use. There is no reason to assume that a DEM which provides a “closer to reality” digital representation of elevation values will provide a more realistic model of hydrological, geomorphological or climatological processes.

There is no knowledge of how one approach to quality assessment compares to another: do they reveal similar aspects of a DEM’s quality and which approach gives a clearer description? Adopting all three approaches and providing a variety of quality measures would seem to have the potential to give the fullest description of quality.

A comprehensive assessment of DEM quality is required, involving DEMs produced from different source data, using varying sampling patterns and created with different interpolation methods. There is also a need to develop a more thorough and comprehensive method of assessing and reporting DEM quality.

The absolute quality of a DEM is related to the intended purpose. The research presented here is within the context of environmental modelling of mountain regions. However, with the intention of developing a generic, rather than application specific, quality assessment strategy, quality is considered in relative rather than absolute terms. For example, DEM A may be determined to be of higher quality than DEM B, but no judgement is made as to whether either DEM A or DEM B is of appropriate or sufficient quality.

The aim of the research presented in this chapter is to undertake a thorough and systematic assessment of DEM quality. This will allow the influence of the various DEM quality factors to be identified and allow a holistic DEM quality assessment methodology to be determined.

The objectives are:

1. To undertake quality assessment of a variety of DEMs, using visual, geomorphometric and accuracy approaches;
2. To compare quality assessment approaches and recommend a holistic approach to DEM quality assessment; and,
3. To compare the quality of the various DEMs to investigate how different factors influence DEM quality.


3.2 Introduction to the Assessment of DEM Quality The literature reveals three broad approaches to assessing DEM quality: visual assessment; geomorphometric characterisation; estimating accuracy. The approaches vary in their degree of objectivity and use of quantitative methods, and also in how much information about DEM quality they reveal.

3.2.1 Visual Assessment of DEM Quality Visualisation provides useful methods for understanding spatial data in general (Hearnshaw & Unwin, 1994) and also for assessing data quality (Goodchild et al., 1994). A number of visualisation techniques can be applied to DEMs and their derivatives to help assess DEM quality. Visualisation represents a subjective and qualitative approach to quality assessment. Findings are dependent on how the user chooses to visualise a DEM and these findings are conveyed descriptively.

3.2.1.1 Two-Dimensional DEM Rendering Two-dimensional rendering allows the end user to examine the range and distribution of elevation values within a DEM. The fidelity of this representation depends on the number of colours in the palette (Fig. 3.1). Too few colours do not give enough detail (Fig. 3.1a), while too many colours can make the details hard to discern (Fig. 3.1c). However, this technique is not very discriminating when applied to DEMs. Only a general impression of topography is given and in terms of quality assessment only major blunders can be identified (Wood, 1996).

Fig. 3.1 Raster rendering of a DEM. a) using a 4 colour palette; b) using a 16 colour palette; c) using a 256 colour palette.



3.2.1.2 Orthographic Display An orthographic display, also known as pseudo 3D projection, emulates an oblique, perspective view of the DEM surface. The amount of information conveyed by this technique depends on the viewing direction and the vertical exaggeration (Wood, 1994). Wood (1996) states that the ‘fishnet’ display is useful for detecting errors (Fig. 3.2a). Additionally, a second variable or data set can be draped over the surface (Fig. 3.2b). The realism of a hillshade map or aerial photograph draped over an orthographic DEM display can take advantage of our ability to make visual sense of images and therefore reveal parts of the surface that differ from the expected (Wood, 1994). The rendering of DEM derivatives, described in §3.2.1.3, could also be draped to good effect. Bolstad & Stowe (1994, p.1328) describe using such visualisation techniques to check for “reasonableness, conformance to general knowledge of terrain shape, and geomorphic consistency (e.g. connected stream channels, ridges)”.

Fig. 3.2 Orthographic displays. a) fishnet display; b) aerial photograph drape.

3.2.1.3 DEM Derivatives Rendering DEM derivatives can provide more useful information on quality than rendering the DEM. Terracing can be identified from rendered gradient and curvature maps (Fig. 3.3a), while rendered aspect maps and hillshade maps can help spot flat peaks, ramps and other interpolation artefacts (Fig. 3.3b; Acevedo, 1991; Carrara, et al., 1997; Wood, 1994; Wood, 1996). Giles & Franklin (1996) used a gradient map to identify spatially autocorrelated noise in a DEM, portrayed as a series of pits and hummocks in the image. As mentioned above, draping these rendered maps over an orthographic display of the DEM can help.
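The gradient and aspect maps discussed above are derived from local elevation differences. The following is a minimal sketch, assuming a simple central-difference estimate over a 3 by 3 window with row 0 to the north; the function name `slope_aspect` and the window layout are illustrative assumptions, and the GIS packages used in this research may apply different finite-difference schemes.

```python
import math

def slope_aspect(window, cell_size):
    """Estimate gradient and aspect at the centre of a 3x3 elevation window.

    Assumptions (not from the thesis): window[0] is the northern row,
    window[r][0] is the western column, and central differences are used.
    Returns (gradient in degrees, aspect in degrees clockwise from north),
    with aspect None for a flat cell.
    """
    # Rate of elevation change towards the east and towards the north.
    dz_dx = (window[1][2] - window[1][0]) / (2.0 * cell_size)
    dz_dy = (window[0][1] - window[2][1]) / (2.0 * cell_size)
    gradient = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if gradient == 0.0:
        return 0.0, None  # flat cells have no defined aspect
    # Aspect is the downslope direction, measured clockwise from north.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return gradient, aspect
```

Terraced DEMs would show up in such output as runs of zero gradient separated by narrow bands of steep gradient, and flat cells as undefined aspect.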



Fig. 3.3 Rendered DEM derivative maps. a) gradient map showing terracing as narrow bands of steep gradients (white/light grey); b) aspect map showing flat areas (black) and ramps (most pronounced to upper left of large circular flat area).

3.2.2 Quality Assessment by Geomorphometric Characterisation Measurement of surface form provides several techniques for assessing quality. Evans (1972 & 1981) developed a range of geomorphometric variables for giving a comprehensive and quantitative description of surface form in the field of landform assessment. Geomorphometric variables can quantify some of the issues and artefacts revealed by visualisation, allowing objective comparison of DEMs. However, this approach to quality assessment seems to be rarely undertaken.

Carrara et al. (1997) and Wood (1996) use frequency histograms to identify terracing as an artefact of interpolation. A frequency histogram of a DEM’s elevation values is often “spiky”, because of the terracing effect (Fig. 3.4a; Carrara et al., 1997). Wood (1996) quantifies the degree of spikiness by calculating the modulus of each cell’s elevation with respect to the contour interval for all cells of the DEM, plotting these as a “hammock plot” (Fig. 3.4b) and then calculating a “Hammock Index”. Some landscapes are terraced to some extent, either naturally or as a result of agricultural practices. Therefore, a “spiky” histogram does not necessarily imply a poorer quality DEM. Natural terracing would only cause a high Hammock Index if the terraces happened to occur at the same elevations as the contour intervals. Measurements of surface form must be interpreted within the context of the characteristics of the land surface that is being represented.
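The data behind a hammock plot can be sketched as follows. This is an illustrative sketch only, assuming integer-metre elevations and a hypothetical helper name `hammock_counts`; the Hammock Index itself is not computed because its formula is not given here.

```python
from collections import Counter

def hammock_counts(elevations, contour_interval):
    """Count DEM cells by elevation modulo the contour interval.

    Assumption: elevations are rounded to whole metres before taking the
    remainder. Returns a list of length contour_interval, where entry m is
    the number of cells whose elevation leaves remainder m.
    """
    counts = Counter(int(round(z)) % contour_interval for z in elevations)
    return [counts.get(m, 0) for m in range(contour_interval)]
```

A DEM with strong terracing at the source contour elevations would be expected to concentrate its cells at remainder 0, whereas a DEM free of this artefact would spread them more evenly across all remainders.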



Fig. 3.4 Identifying DEM terracing: a) elevation frequency histogram; b) “hammock plot” of the modulus of the contour interval.

These techniques seem useful for identifying DEMs that are suitable for particular application fields. For instance, geomorphometrics could give information about how smooth the surface is, how many spurious pits are present, how large the pits are, and whether there is a bias to the distribution of slope gradients, maybe due to quantisation problems. This information would help the user decide whether a DEM is suited to a particular purpose, such as hydrological modelling.

Examining the frequency distributions of gradient and aspect can help discern the smoothness of a DEM. Surface texture indices provide an indication of the variability and roughness of DEMs (Hartshorne, 1996). Hartshorne (1996) uses entropy (Equation 3.1) to measure surface variability and suggests that measures of fractal dimension, such as Clarke’s (1986) triangular prism surface area method, could also be used. There are also various indices of diversity and heterogeneity, usually used in the field of landscape ecology, which could be used to quantify surface texture.



Equation 3.1: Surface Entropy

$$\text{Entropy} = -\sum_{i} p_i \log(p_i)$$

where $p_i$ is the proportion of cells within a 3 by 3 window of cell $i$ with the same elevation as cell $i$.

As described earlier, pits within the DEM surface represent one possible artefact of the interpolation process. These pits prevent many automated hydrological analyses from running and must therefore be removed. There are various algorithms for removing these pits, which are reviewed by Martz and Garbrecht (2000). The algorithms most commonly implemented in GIS packages are of the flood-filling type, which remove pits by filling them in. As with any manipulation of a DEM’s cell values, pit removal is likely to reduce accuracy. So calculating the size of pits within a DEM is a useful indicator of DEM quality. The original DEM can be subtracted from a flood-filled DEM to calculate the volume of pits within the original DEM.
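Both measures described above can be sketched in a few lines. The sketch below makes several assumptions that are not stated in the text: the 3 by 3 window includes the centre cell, the natural logarithm is used, edge cells are skipped, and pit volume is the summed fill depth multiplied by cell area. The function names are hypothetical, and the flood-filled DEM is taken as an input rather than computed.

```python
import math

def surface_entropy(dem):
    """Sum of -p_i * log(p_i) over interior cells of a DEM (nested lists).

    Assumptions: the 3x3 window includes cell i itself, natural log is used
    and border cells (which lack a full window) are ignored.
    """
    total = 0.0
    for r in range(1, len(dem) - 1):
        for c in range(1, len(dem[0]) - 1):
            window = [dem[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            p = window.count(dem[r][c]) / 9.0  # proportion matching cell i
            total -= p * math.log(p)
    return total

def pit_volume(original, filled, cell_size):
    """Volume of pits: (flood-filled DEM minus original DEM) times cell area."""
    depth_sum = 0.0
    for row_o, row_f in zip(original, filled):
        for z_o, z_f in zip(row_o, row_f):
            depth_sum += max(z_f - z_o, 0.0)
    return depth_sum * cell_size * cell_size
```

For a single 2 m deep pit in a DEM with 10 m cells, the volume under these assumptions is 2 × 10 × 10 = 200 cubic metres.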

3.2.3 Estimating DEM Accuracy The most intuitive way to assess the quality of a DEM is to determine the amount of error in the elevation values. Determining error for every cell is not practical so a number of sample points are selected to compare the DEM cell values with those of the terrain surface. From this sample the characteristics of the error distribution across the whole DEM area can be estimated by measures of accuracy.

In many cases, due to time and accessibility constraints, it will not be possible to make on-the-ground measurements of the “true” elevation. Instead of determining the absolute accuracy of the DEM, it is more practical, and hence more common, to measure the relative accuracy in comparison with sample point measurements known to be of a higher order of accuracy (Ehlschlaeger & Shortridge, 1997).

There are two issues to consider when using sample points to check DEM accuracy. First, how should the sample points be selected? Second, how can measurements of a higher order of accuracy be obtained? These issues are examined below followed by a review of ways of measuring accuracy.



3.2.3.1 Selecting Sample Points Comparing every grid point of the DEM with those of a more accurate DEM will provide a very reliable assessment of accuracy. This scenario is rare as the DEM concerned is usually generated by the most accurate means available. Exceptions are Bolstad & Stowe (1994) and Day & Muller (1988), who subtracted the elevation values of a SPOT derived DEM from the elevation values in an aerial photography derived DEM to create residual surfaces, and Sasowsky (1992) who compared a SPOT derived DEM with one derived from a topographic map.

Usually more accurate measurements can only be acquired for a sample of the grid points. The fewer the number of samples, the more efficient, in terms of time and cost, the quality assessment exercise will be. However, fewer samples mean a less reliable quality assessment, especially in the highly variable terrain of mountain environments. So the choice of sample size is important. Li (1991) states that the optimal sample size depends partly on the heterogeneity of the terrain and partly on how reliable either the estimate of mean elevation error or the estimate of the standard deviation of the elevation error needs to be. Equations 3.2 and 3.3 are derived by Li using standard statistical theory to estimate sample size based on the required accuracy for the mean error estimate and for the estimate of the standard deviation of the error:

Equation 3.2: Sample Size Estimate with Known Required Accuracy for Estimate of Mean Error

$$n = \frac{Z_r^2 \sigma^2}{R^2}$$

where $n$ is the estimated required sample size, $Z_r$ is the Z statistic for the required confidence level, $\sigma$ is the estimated standard deviation of the elevation error and $R$ is the required level of accuracy (or confidence). For example: say the required confidence level is 0.95 and the standard deviation of the elevation error is estimated as 5, then $Z_r = 1.96$, so:

$$n = \frac{(1.96)^2 \times (5)^2}{(0.95)^2} \approx 4.257 \times 25 \approx 106.4$$
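Equation 3.2 can be evaluated directly. The following is a minimal sketch with a hypothetical function name; the example values in the usage note (an error standard deviation of 5 m and a required accuracy of 1 m) are illustrative assumptions rather than figures from this research.

```python
import math

def sample_size_for_mean(z_r, sigma, r):
    """Equation 3.2: n = (Z_r^2 * sigma^2) / R^2.

    z_r   -- Z statistic for the required confidence level (1.96 for 95%)
    sigma -- estimated standard deviation of the elevation error
    r     -- required accuracy of the mean error estimate
    """
    return (z_r ** 2) * (sigma ** 2) / (r ** 2)

def required_points(z_r, sigma, r):
    """Round the estimate up, since a fraction of a sample point cannot be surveyed."""
    return math.ceil(sample_size_for_mean(z_r, sigma, r))
```

With `z_r=1.96`, `sigma=5.0` and `r=1.0` the estimate is about 96.04, so 97 sample points would be needed. Halving `r` quadruples the estimate, which shows how quickly survey effort grows as the accuracy requirement tightens.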

Equation 3.3: Sample Size Estimate with Known Required Accuracy for Estimate of Standard Deviation of Error

$$n = \frac{1}{2R^2}$$

where $n$ is the estimated required sample size and $R$ is the accuracy of a standard deviation estimate. For example: say we want the true standard deviation to be within +/- 5% of the standard deviation estimate, then:

$$n = \frac{1}{2 \times (0.05)^2} = \frac{1}{2 \times 0.0025} = \frac{1}{0.005} = 200$$

These equations require a measure of DEM accuracy (an estimate of the standard deviation of the error) prior to determining the number of samples needed to reliably measure DEM accuracy. Therefore, after selecting a number of sample points and determining DEM accuracy, the equations provide a means for checking the reliability of the accuracy estimates. Then, if necessary, the number of sample points can be increased and the accuracy estimates re-evaluated. A significant shortcoming, however, is that there is no recognition of the impact that the location and distribution of sample points can have on the reliability of accuracy estimates. No research has attempted to use spatial statistics to quantify the optimum distribution of sample points for DEM accuracy estimates. In the absence of such quantification one can only attempt to ensure that the sample points are evenly distributed across the DEM area and are located in a representative range of terrain types, i.e. in areas of steep, intermediate and gentle gradients, convex and concave curvatures, ridge and stream lines, and so on.
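The checking procedure described above (collect a sample, estimate the error standard deviation, then test whether the sample was large enough) might be sketched as follows. The function names and the decision to take the larger of the two sample size estimates are assumptions made for illustration.

```python
import math

def sample_size_for_sd(r):
    """Equation 3.3: n = 1 / (2 * R^2), with R as a proportion (0.05 for 5%)."""
    return 1.0 / (2.0 * r ** 2)

def sample_is_sufficient(n_collected, z_r, sigma_estimate, r_mean, r_sd):
    """Check a collected sample against Equations 3.2 and 3.3.

    Assumption: the sample must satisfy both requirements, so the larger
    estimate governs. Returns (is_sufficient, points_needed).
    """
    needed_mean = (z_r ** 2) * (sigma_estimate ** 2) / (r_mean ** 2)
    needed_sd = sample_size_for_sd(r_sd)
    needed = math.ceil(max(needed_mean, needed_sd))
    return n_collected >= needed, needed
```

If the check fails, further points would be surveyed, the standard deviation re-estimated and the check repeated. Neither equation says anything about where the additional points should be placed.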

3.2.3.2 Sources of More Accurate Data Given that a DEM is often generated from the best quality elevation data available, obtaining a more accurate measure of elevation can be problematic. The following section considers three common sources of elevation data: topographic maps, aerial photography and ground survey.

Kumler (1994) describes three techniques for using topographic map elevation data to measure DEM accuracy. Topographic maps are widely available, but for the same reason they are commonly used to generate DEMs. Such maps can only be used as more accurate elevation data if the DEM being assessed was generated from smaller scale mapping or from a less accurate source such as satellite imagery stereo-correlation (Giles & Franklin, 1996), or if not all of the map’s elevation information was used in production of the DEM.



Spot heights provide the most accurate elevation measurements on topographic maps and have been used to assess DEM accuracy (Bolstad & Stowe, 1994; Shearer, 1990). However, spot heights tend to be located at special points on the surface, namely peaks or pits. It is unrepresentative to base a DEM accuracy estimate only on how well surface extremes are described (Kumler, 1994).

Contour lines can only be used as the more accurate elevation data if they have not been directly used to generate the DEM. For example, the DEM may have been interpolated from a regular or progressive grid-sampling scheme that was initially derived from topographic map contour lines. In this situation sample points lying on the contour lines can be used or a completely random point sample can be obtained by manually interpolating elevation values between the contour lines (Giles & Franklin, 1996; Eklundh & Mårtensson, 1995).

A quality assessment involving either topographic spot heights or contours only considers the quality of the sampling and interpolation procedures used. It does not incorporate any consideration for the quality of the original elevation data. Although the above techniques are easily implemented due to the availability of topographic map data, such measurements only represent partial quality assessments and can only provide limited knowledge of DEM quality.

High accuracy sample points can be stereoscopically measured from aerial photography. Where a DEM has been generated from stereoscopically derived contour lines, the measurement of individual sample elevations represents more accurate data than the on-the-fly derivation of contours. Even where the DEM has been generated from a sample of stereoscopically derived points, a different set of points can be selected for the quality assessment. However, this technique is of limited use because of its limited availability. Most users of “off-the-shelf” digital elevation data will not have the required photography, equipment or expertise at their disposal to be able to perform such quality checks.

The only practical way to ensure that “true” measures of elevation are obtained at an unbiased sample of points is to collect data from the field. Such data may already be available, for example the United States National Geodetic Survey data used by Bolstad & Stowe (1994) or the bench mark and spot height measurements that can be purchased from
the Ordnance Survey. However, these existing data are both expensive to acquire and will rarely provide a good distribution of samples. Such data are usually only located at surface specific points, particularly pits and peaks, and over man-made structures, particularly roads. Usually a ground survey will have to be undertaken by the person or organisation undertaking the accuracy assessment.

Traditional ground survey methods using theodolite technology have been prohibitively time consuming and often completely impractical in the remote and rugged terrain of mountain areas. GPS provides a faster and more convenient means for ground survey (Adkins & Merry, 1994; Giles & Franklin, 1996). Such a method can be used for DEMs derived from any data source. The GPS data collection technique can be varied to suit the relative accuracy of the DEM. Sub-metre accuracy measurements can be achieved using hand-held GPS receivers that are capable of differentially correcting carrier phase signals (Heywood et al., 1999; Magellan, 1994). Less accurate GPS measurements, using less sophisticated techniques and/or equipment, may be adequate for lower accuracy DEMs. The only restriction on the location of sample points is accessibility. Therefore a more representative sample distribution can be obtained.

The high accuracy elevation measurements provided by LIDAR data may in time provide a useful source of more accurate data. However, the limited coverage and high cost are currently limiting factors.

3.2.3.3 Accuracy Measures Having obtained a sample of control points from a more accurate data source, and presuming that this sample is sufficiently large and representatively distributed, the difference between the DEM and control point elevation values can be calculated. The next consideration is how to turn this set of individual elevation errors into an estimate of the DEM’s accuracy. The set of error measurements are samples from a frequency distribution representing the differences between two surfaces. Momental statistics, including measures of central tendency and dispersion, can be used to describe this frequency distribution (Miller & Kahn, 1962). Table 3.1 lists the more commonly used measures of accuracy and indicates which combinations five authors have implemented (Day & Muller, 1988; Eklundh & Mårtensson, 1995; Kumler, 1994; Li, 1991; Sasowsky, 1992).



Table 3.1 Use of momental statistics to describe DEM accuracy.
Authors (columns): Day & Muller (1988); Eklundh & Mårtensson (1995); Kumler (1994); Li (1991); Sasowsky (1992).
Measures (rows): Mean error; Absolute mean error; Min/max. error; RMS error; Standard deviation of error; Percentiles; Reliability†.
† Day & Muller (1988) use the term “reliability” to describe the percentage of points where (error - mean error) lies within 3 standard deviations of 0.

Table 3.1 shows that there is no standard set of statistics for reporting the quality of a DEM in terms of accuracy measures, although the RMS error is used by all 5 authors. More importantly, some of these measures are inappropriate, and even misleading, when describing the quality of the surface.

The mean error value simply describes how evenly the elevation error values are spread about zero and can highlight a systematic deviation or bias in elevation values (Wood, 1996). A lower mean error does not necessarily imply a more accurate DEM (Fig. 3.5). In order to compare two DEMs and find which on average has error values closest to zero, the absolute mean must be used. Mean error represents a systematic bias across the DEM that can easily be removed by adding to or subtracting from the cell values. The scale of the remaining non-systematic error could be described by the absolute mean of the adjusted error values, which equates to the average deviation from the mean (ADM; Equation 3.4). Alternatively, ADM can be given in the first place as a good descriptor of the scale of errors.



Fig. 3.5 Mean error and accuracy. Plot of elevation error (m), on an axis from -60 to 60, against sample point ID (1 to 10) for DEM A and DEM B. The mean error of DEM A is zero. The mean error of DEM B is 8.3. DEM B is clearly more accurate than DEM A, as the individual error values are closer to zero. The higher mean error value for DEM B simply shows that there is a bias towards overestimation of elevation in this DEM.

Equation 3.4: Absolute Deviation from the Mean (ADM)

$$ADM = \frac{\sum_{i=1}^{N} \left| e_i - \bar{e} \right|}{N}$$

where $e_i$ is the error at sample point $i$, $\bar{e}$ is the mean error and $N$ is the number of sample points.

The other statistical measures listed in Table 3.1 all relate to the spread of error values around the mean. The minimum and maximum error are of limited use as they only identify the greatest error within the sample and give no information about the sample as a whole. They are obviously highly susceptible to outliers. More meaningful information on the overall spread of error values is provided by the root mean squared error (RMSE; Equation 3.5), standard deviation of error and percentiles. The most commonly used of these is the RMSE. Both the Ordnance Survey (1999) and USGS (1998) describe the accuracy of their DEMs and other digital elevation data with a RMSE figure. The RMSE is easy to calculate, report and understand: it is just a single number. However, RMSE only gives a good description of error spread when the mean error is zero (Monckton, 1994; Wood, 1996). Many assessments of DEM error have found that this ‘mean equal to zero’ criterion is not the case (Li, 1993a; Li, 1993b; Monckton, 1994).
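The distinction drawn above between mean error, absolute mean error and ADM can be illustrated with a short sketch. The two error samples in the usage note are hypothetical values chosen to mimic the situation in Fig. 3.5 (a zero-mean but widely scattered DEM A, and a biased but tightly clustered DEM B); they are not the data plotted in that figure.

```python
def mean_error(errors):
    """Mean of the signed errors; reveals systematic bias."""
    return sum(errors) / len(errors)

def absolute_mean_error(errors):
    """Mean of the unsigned errors; average distance of errors from zero."""
    return sum(abs(e) for e in errors) / len(errors)

def adm(errors):
    """Equation 3.4: average absolute deviation of the errors from their mean."""
    e_bar = mean_error(errors)
    return sum(abs(e - e_bar) for e in errors) / len(errors)
```

For hypothetical errors `[50, -50, 40, -40, 30, -30, 20, -20, 10, -10]` (DEM A) the mean error is 0 but the absolute mean error and ADM are both 30. For `[8, 9, 7, 10, 8, 9, 8, 7, 9, 8]` (DEM B) the mean error is 8.3 while the ADM is only 0.76, so once the bias is removed DEM B is by far the more accurate of the two.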



Equation 3.5: Root Mean Squared Error (RMSE)

$$RMSE = \sqrt{\frac{\sum_{1}^{n} (z_i - z_j)^2}{n}}$$

where $z_i$ and $z_j$ are two corresponding elevation values (e.g. DEM cell value and corresponding sample point elevation) and $n$ is the number of sample points.
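A minimal sketch of Equation 3.5 is given below, together with a helper that makes the dependence on the mean error explicit. The helper uses the population form of the standard deviation (dividing by n), which is an assumption made so that the identity RMSE² = mean² + SD² holds exactly; the function names are illustrative.

```python
import math

def rmse(dem_values, reference_values):
    """Equation 3.5: root of the mean squared difference between paired elevations."""
    squared = [(zi - zj) ** 2 for zi, zj in zip(dem_values, reference_values)]
    return math.sqrt(sum(squared) / len(squared))

def mean_and_sd(errors):
    """Mean error and population standard deviation of the errors."""
    n = len(errors)
    mean = sum(errors) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    return mean, sd
```

Because RMSE² equals the squared mean error plus the error variance, RMSE coincides with the standard deviation only when the mean error is zero. Any bias inflates the RMSE without any change in the spread of the errors.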

Wood (1996) identifies a second problem with use of RMSE. Relative relief and scale of measurement influence the magnitude of the RMSE value, so comparison between areas is difficult. He proposes using an accuracy ratio (RMSE divided by a measure of relative relief; Equation 3.6) to remove the effects of relative relief.

Equation 3.6: Accuracy Ratio (a)

$$a = \sqrt{\frac{\sum_{1}^{n} (z_i - z_j)^2}{\sum_{1}^{n} (z_i - \bar{z}_i)^2}}$$

where $z_i$, $z_j$ and $n$ are as in Equation 3.5 and $\bar{z}_i$ is the average DEM elevation.

Standard deviation corrects for non-zero means and the use of this, along with a statement of the mean error, gives a more appropriate measure of DEM accuracy (Li, 1993a; Li, 1993b). However, both standard deviation and RMSE are open to influence by one or two atypical outliers. For this reason Kumler (1994) advocates use of the 90th percentiles to better characterise the spread of error values. The measure of ‘reliability’ used by Day & Muller (1988) is essentially a companion to the use of percentiles as it indicates the percentage of sample points that can be assumed to be outliers.
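The accuracy ratio, a percentile of the errors and the reliability measure can be sketched as below. Several choices here are assumptions rather than details given in the text: the accuracy ratio is taken as the RMSE divided by the population standard deviation of the DEM elevations, the percentile uses the nearest-rank definition, and reliability uses the population standard deviation of the errors. All function names are hypothetical.

```python
import math

def accuracy_ratio(dem_values, reference_values):
    """RMSE divided by the spread of DEM elevations about their mean (assumed form)."""
    mean_dem = sum(dem_values) / len(dem_values)
    num = sum((zi - zj) ** 2 for zi, zj in zip(dem_values, reference_values))
    den = sum((zi - mean_dem) ** 2 for zi in dem_values)
    return math.sqrt(num / den)

def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile, e.g. pct=90 for the 90th percentile."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100.0))
    return ordered[rank - 1]

def reliability(errors):
    """Percentage of points whose (error - mean error) lies within 3 SD of zero."""
    n = len(errors)
    mean = sum(errors) / n
    sd = math.sqrt(sum((e - mean) ** 2 for e in errors) / n)
    within = sum(1 for e in errors if abs(e - mean) <= 3.0 * sd)
    return 100.0 * within / n
```

Applying `percentile_nearest_rank` to the absolute errors gives a spread measure that a single extreme value cannot shift, while `reliability` reports what share of the sample falls inside the three standard deviation band.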

A further problem with all of the above momental statistics as measures of accuracy is that only one value is given for the whole DEM. This suggests an underlying assumption of stationarity, i.e. that DEM accuracy is not spatially variable. This assumption is rarely true and is examined in Chapter 4.



3.3 Methodology 3.3.1 Study Area The area selected for this research into DEM quality comprises approximately 2 km² within the Snowdonia massif, North Wales, UK (Fig. 3.6). This area was chosen for its geomorphological variety and accessibility for fieldwork. The terrain includes a portion of a glacial trough (the Llanberis valley), two glacial cirques (Cwm Glas and Cwm Glas Mhor), the peak of Crib y Ddysgl and the eastern end of the Crib Goch Ridge. Elevation varies from 160m to 1065m above sea level. This relative relief, the extremes of gradient and aspect and the rugged nature of the terrain are characteristic of a mountainous environment. The area is snow and ice free most of the year and has no woodland cover. The aerial photography used to derive the digital elevation data (§3.3.2) has not been inspected. Shadowing may obscure some of the terrain, but snow, cloud and vegetation cover are unlikely to have an effect on data quality.



Fig. 3.6 The Snowdonia study area (Adapted from: EDINA, 2001)



3.3.2 Digital Elevation Data Two criteria have been used to select the digital elevation data used in this study. First, the data had to be representative of those accessible to organisations involved in mountain environment research. Second, it was required that the data could be prepared into a variety of formats, namely regular grids, random points and contour lines. These criteria led to the purchase of tile SH65NW of the Ordnance Survey’s Landform Profile data. The data represent 1:10 000 scale digital contour lines at a 10m vertical interval. The Ordnance Survey also sells 1:50 000 scale 50m contour interval data (Landform Panorama) and 1:50 000 scale 50m spacing regular grid of points (Landform Panorama DTM). The 1:10 000 scale contour line data were chosen because they are the least processed of the Ordnance Survey’s digital elevation data. The 1:50 000 contours are generalised and the regular grids are interpolated from these data. The process of generating this data set has already been described in §2.4.1.3.

At the time of purchasing the data the Ordnance Survey stated that the Landform Profile data have an accuracy of 1.5m RMSE (Ordnance Survey, 1996). More recently the accuracy of this data set is stated as 1.8m RMSE (Ordnance Survey, 1999). This measure of accuracy is of limited benefit to the user of a DEM generated from these data for the following two reasons. First, this RMSE figure applies to the whole of the UK. There is no indication of how accuracy varies over different types of terrain. Second, the RMSE only describes the accuracy of the digital contours. There is no indication of the accuracy of a DEM generated from these contours.

3.3.3 Software Two GIS software packages have been used in this research: Idrisi (Idrisi for Windows version 2.1 and Idrisi32 versions 1 and 2) and ArcView (versions 3.0 to 3.2). These PC-based packages are widely used in the field of spatial modelling for environmental and mountain studies and implement the most commonly used interpolation functions. Idrisi has linear contour and inverse distance weighting interpolation functionality. ArcView has inverse distance weighting and spline interpolation functionality. Idrisi32 and an extension to ArcView also allow Kriging interpolation to be implemented.



3.3.4 Data Preparation An important part of this research is to investigate the influence of different elevation data sampling patterns on the quality of the resultant DEMs. Additionally, inverse distance weighting, spline and Kriging interpolation algorithms require a point data set as input. Consequently, preparing the digital elevation data for the interpolation to generate the DEMs involved converting the contour line sampling pattern to point data and to regular grids of varying intervals. The resultant data sets were:

1. CONTOUR LINES AT 10M INTERVALS
The original 10m interval contour lines from the Ordnance Survey Landform Profile data set.

2. CONTOUR LINES AT 50M INTERVALS
Every fifth contour extracted from the original data set. This replicates the contour interval of the Ordnance Survey Landform Panorama data set. However, it is not an exact simulation as it comes from 1:10 000 scale mapping rather than 1:50 000 scale. Therefore, the form of each contour line should be more finely detailed, potentially giving a higher quality DEM.

3. 10M CONTOUR VERTICES
The original data set broken down into a series of points coincident with all of the vertices of the contour data.

4. GENERALISED 10M CONTOUR VERTICES
The above data set (10m contour vertices) comprises an atypical concentration of points along the contour lines. To create a more evenly distributed point data set the fineness of detail in the original 10m contour lines was reduced using the Douglas-Peucker line generalisation algorithm then broken down into point format. In attempting to preserve the shape of a line with fewer vertices, the Douglas-Peucker algorithm will keep most vertices in regions of high planimetric curvature and tend to remove vertices from low planimetric curvature regions. This biasing of data location may influence DEM quality. However, this is not investigated in this research. The resulting data set comprised approximately half the total number of 10m contour vertices. The discarded vertices could be used for assessing DEM accuracy. However, as described in §3.2.3.2, comparing DEM elevations with elevations of another data set does not completely represent “true” accuracy. To assess DEM accuracy with respect to “true” elevation, GPS elevation measurements are used here (see §3.3.8.1), rather than discarded contour vertices.

5. 10m REGULAR GRID
The above data set (generalised 10m contour vertices) was judged to still represent a higher than average density of points. So the original data set was used to derive a regular grid of points with 10m grid spacing by interpolating using inverse distance weighting. It is recognised that the accuracy of any DEMs generated from this data set and the other regular grids described below will be reduced as a consequence of interpolating from interpolated data. Nonetheless, it was felt that such DEMs could still provide useful information about the influence of sampling pattern on DEM quality.

6. 30m REGULAR GRID
As above except with 30m grid spacing. This replicates the sampling pattern of the USGS’s largest scale elevation data.

7. 50m REGULAR GRID
As above except with 50m grid spacing. This replicates the sampling pattern of the Ordnance Survey Landform Panorama DTM data sets.
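Two of the preparation steps above, extracting every fifth contour and generalising contour lines, can be sketched as follows. This is a generic textbook form of the Douglas-Peucker algorithm, not the implementation in the GIS software used here. The tolerance value applied in this research is not stated, so `tolerance` is a free parameter, and `select_index_contours` is a hypothetical helper that assumes contour elevations are whole multiples of 10 m.

```python
import math

def select_index_contours(contour_elevations, interval=50):
    """Keep contours whose elevation is a multiple of the coarser interval."""
    return [z for z in contour_elevations if z % interval == 0]

def perpendicular_distance(p, a, b):
    """Distance from point p to the straight line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / length

def douglas_peucker(points, tolerance):
    """Recursively drop vertices lying within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    index, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > d_max:
            index, d_max = i, d
    if d_max <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # avoid duplicating the shared split vertex
```

The recursion always retains the vertex furthest from the current baseline, which is why vertices survive preferentially where a contour bends sharply and are removed along straighter stretches.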

3.3.5 DEM Generation In addition to considering the role of data sampling patterns, a second key part of this research has been to investigate the importance of interpolation functions and interpolation parameters on the quality of the resultant DEMs. This section begins by detailing the different interpolation procedures used, and how their parameters varied. This is followed by a definitive list of all the DEMs generated, using different combinations of sampling pattern and interpolation technique.



3.3.5.1 Interpolation Procedures

1. INTERCON (Clark Labs, 2001)
Idrisi’s algorithm for linear interpolation from contour lines. At each grid cell this algorithm generates 4 profiles using the 8 cardinal directions of the compass, then linearly interpolates along the steepest profile.

2. INTERPOL (Clark Labs, 2001)
Idrisi’s algorithm for inverse distance weighted (IDW) interpolation from point data. The user specifies the distance weighting. Weightings of 2 and 5 were used. The options for specifying the search radius are limited and unintuitive. The search can be restricted to a six-point radius, which actually means the algorithm guesses how large (in grid cells) the search radius should be in order to encounter six points. At each grid cell, the interpolation proceeds if between four and eight points are found. Otherwise the search radius distance is increased or decreased as appropriate until 4 to 8 points are chosen. Disabling this variant of the six-point search radius causes the algorithm to use all points within the data set for every grid cell. DEMs were generated using the former option, the six-point search radius.

3. INVERSE DISTANCE WEIGHTING (IDW; ESRI, 1998)
ArcView’s IDW function allows specification of the weighting and complete control of the search radius in terms of either its extent in metres or the number of points to include. Radii including 6 and 12 points were used, both with weightings of 1, 2, 3, 4 and 5. Thus a comprehensive investigation of the effect of varying inverse distance weighting parameters was possible.

4. SPLINE (ESRI, 1998)
ArcView’s spline interpolation algorithm allows control of the search radius in the same way as the inverse distance weighting algorithm. Search radii of 6 and 12 points were again used. The curvature of the surface patches is controlled by choosing regularised splines or splines with tension. For regularised splines the user specifies a weight that determines how the third derivative of the surface patch influences the minimisation of curvature. For splines with tension the specified weight determines how the first derivative influences the minimisation of curvature. In effect using splines with tension creates less smooth DEMs. Regularised splines and splines with tension were used, both with the default weight of 0.1.
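The roles of the weighting and the search radius in inverse distance weighting can be illustrated with a generic sketch. It is not the INTERPOL or ArcView implementation: the function name `idw`, the fixed number of nearest points and the handling of coincident points are all assumptions.

```python
import math

def idw(x, y, samples, power=2, n_points=6):
    """Inverse distance weighted estimate at (x, y).

    samples  -- list of (x, y, z) tuples
    power    -- distance weighting exponent (1 to 5 in the DEMs listed here)
    n_points -- number of nearest sample points to include (6 or 12 here)
    """
    nearest = sorted(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))[:n_points]
    num = den = 0.0
    for sx, sy, sz in nearest:
        d = math.hypot(sx - x, sy - y)
        if d == 0.0:
            return sz  # estimate coincides with a sample point
        w = 1.0 / d ** power
        num += w * sz
        den += w
    return num / den
```

Raising `power` pulls each estimate towards its nearest sample, which tends to produce flatter areas around data points and sharper transitions between them. Increasing `n_points` averages over a wider neighbourhood.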

Kriging was considered as a fifth interpolation method. As described in §2.4.3.5, Kriging is unlikely to offer significant benefits over other interpolation techniques when applied to modelling of heterogeneous mountain terrain. This was briefly investigated by using ordinary Kriging to create one Snowdonia DEM. Kriging was found to produce a DEM of similar character and quality to inverse distance weighted DEMs (Appendix 1). Therefore, Kriging was not used as a fifth interpolation technique.

3.3.5.2 The DEMs To allow an investigation into the changes in DEM quality caused by varying data sampling patterns, interpolation techniques and GIS software used, 26 DEMs have been generated. Table 3.2 lists these 26 DEMs, giving the software package, the sampling pattern, interpolation algorithm and user-defined parameters used.



Table 3.2 Generated DEMs.

DEM       Software  Sampling pattern              Algorithm  Search radius (points)  Weighting (IDW) / curvature constraint (spline)
10mCont   Idrisi    10m contours                  INTERCON   n/a                     n/a
50mCont   Idrisi    50m contours                  INTERCON   n/a                     n/a
AllVert2  Idrisi    All contour vertices          INTERPOL   6                       2
AllVert5  Idrisi    All contour vertices          INTERPOL   6                       5
GenVert2  Idrisi    Generalised contour vertices  INTERPOL   6                       2
GenVert5  Idrisi    Generalised contour vertices  INTERPOL   6                       5
10mGrid2  Idrisi    Regular grid, 10m spacing     INTERPOL   6                       2
10mGrid5  Idrisi    Regular grid, 10m spacing     INTERPOL   6                       5
30mGrid2  Idrisi    Regular grid, 30m spacing     INTERPOL   6                       2
30mGrid5  Idrisi    Regular grid, 30m spacing     INTERPOL   6                       5
50mGrid2  Idrisi    Regular grid, 50m spacing     INTERPOL   6                       2
50mGrid5  Idrisi    Regular grid, 50m spacing     INTERPOL   6                       5
IDW16     ArcView   Generalised contour vertices  IDW        6                       1
IDW26     ArcView   Generalised contour vertices  IDW        6                       2
IDW36     ArcView   Generalised contour vertices  IDW        6                       3
IDW46     ArcView   Generalised contour vertices  IDW        6                       4
IDW56     ArcView   Generalised contour vertices  IDW        6                       5
IDW112    ArcView   Generalised contour vertices  IDW        12                      1
IDW212    ArcView   Generalised contour vertices  IDW        12                      2
IDW312    ArcView   Generalised contour vertices  IDW        12                      3
IDW412    ArcView   Generalised contour vertices  IDW        12                      4
IDW512    ArcView   Generalised contour vertices  IDW        12                      5
SpReg6    ArcView   Generalised contour vertices  Spline     6                       Regularised
SpReg12   ArcView   Generalised contour vertices  Spline     12                      Regularised
SpTen6    ArcView   Generalised contour vertices  Spline     6                       Tension
SpTen12   ArcView   Generalised contour vertices  Spline     12                      Tension

3.3.6 Visual Assessment of DEM Quality

A gradient image and an aspect image were derived from each DEM. For each DEM three images were displayed with grid cells shaded according to elevation, gradient and aspect values. An orthographic display of the terrain surface was also produced. These images were examined to assess the representation of major landform features, the level of detail and general nature of the DEMs’ representation of the terrain surface and the presence of major blunders and interpolation artefacts.
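As an illustration of how gradient and aspect images can be derived from a gridded DEM, the following sketch uses simple finite differences. It is not the routine used by Idrisi or ArcView; the function name, the north-up row ordering, the percentage gradient units and the use of -1 for the aspect of flat cells are all assumptions.

```python
import numpy as np

def gradient_and_aspect(dem, cell_size):
    """Return gradient (%) and aspect (degrees clockwise from north, facing
    downslope) for a north-up grid whose row index increases southwards."""
    dem = np.asarray(dem, dtype=float)
    # np.gradient returns the derivative along rows (southwards) and columns (eastwards)
    dz_drow, dz_dcol = np.gradient(dem, cell_size)
    gradient_pct = 100.0 * np.hypot(dz_dcol, dz_drow)
    # Downslope vector is (-dz/d_east, -dz/d_north); dz/d_north equals -dz_drow
    aspect = np.degrees(np.arctan2(-dz_dcol, dz_drow)) % 360.0
    aspect = np.where(gradient_pct == 0.0, -1.0, aspect)  # flat cells have no aspect
    return gradient_pct, aspect
```

For a plane rising eastwards by 1m per 10m cell, this gives a gradient of 10% and an aspect of 270°, i.e. a west-facing slope.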

3.3.7 Quality of Geomorphometric Characteristics

The distributions of elevation, gradient and aspect values were analysed and the volume of pits within each DEM was calculated (Equation 3.7). This provides quantitative measures of the geomorphometric characteristics of each DEM that represent quality indices. Such assessment indicates how appropriate a DEM would be for applications such as hydrologic simulation or landform representation.

Equation 3.7: DEM Pit Volume (V)

$$V = a \sum_{i=1}^{n} \left( fz_i - z_i \right)$$

where a is the area covered by one grid cell, n is the total number of cells within the DEM, fz_i is the elevation of the filled DEM at cell i and z_i is the elevation of the unfilled DEM at cell i.

Important characteristics of the distribution of elevation, gradient and aspect values were quantified by calculating the percentage of grid cells within each DEM possessing the following key values:

1. The percentage of cells within the DEM with an elevation value equal to a multiple of the contour interval of the source data, i.e. multiples of 10m or 50m. This quantifies the degree of bias towards the source data.

2. The percentage of cells within gradient images derived from the DEMs with a value of zero, i.e. flat cells. This percentage is affected by the degree to which the DEM surface is terraced and the presence of flat-topped peaks and flat-bottomed valleys.

3. The percentage of cells within aspect images derived from the DEMs with a value representing the cardinal points of the compass, i.e. values of 0°, 90°, 180°, 270° or 360°. This indicates how “blocky” a DEM surface representation is.
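The pit volume of Equation 3.7 and the three percentage indices listed above can be sketched as follows. This is an illustrative sketch and not the contents of the ArcView scripts; the function names and the default tolerances of 0.5 are assumptions, and aspect values are assumed to lie between 0° and 360°.

```python
import numpy as np

def pit_volume(filled, unfilled, cell_area):
    """Equation 3.7: cell area multiplied by the summed depth of filling."""
    diff = np.asarray(filled, dtype=float) - np.asarray(unfilled, dtype=float)
    return cell_area * float(diff.sum())

def contour_bias(dem, interval, tol=0.5):
    """Percentage of cells within tol of a multiple of the contour interval."""
    r = np.mod(np.asarray(dem, dtype=float), interval)
    return 100.0 * (np.minimum(r, interval - r) <= tol).mean()

def flatness_index(gradient_pct, tol=0.5):
    """Percentage of cells with a gradient below tol (treated as flat)."""
    return 100.0 * (np.asarray(gradient_pct, dtype=float) < tol).mean()

def blockiness_index(aspect_deg, tol=0.5):
    """Percentage of cells with an aspect within tol of 0, 90, 180, 270 or 360."""
    r = np.mod(np.asarray(aspect_deg, dtype=float), 90.0)
    return 100.0 * (np.minimum(r, 90.0 - r) <= tol).mean()
```

Each index is a simple count over the grid, so values for two DEMs of the same study area can be compared directly.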



The values of these percentages for a particular study area will be affected by the characteristics of the terrain as well as DEM quality. For example, a study area with much flat terrain should have a high percentage of cells with a gradient of zero. Consequently, the above parameters have no generic ideal value and they do not permit comparison of the quality of DEMs for different areas. They do allow comparison of DEMs covering the same study area.

An ArcView DEM Uncertainty and Quality extension (“DEMUncQual.avx”; Appendix 2) was created to automate many of the methods involved in this research. Loading the extension adds a new menu named “DEM qual. & unc.” to ArcView’s View menu bar (Fig. 3.7). This menu provides access to a number of scripts that automate processing steps. One such script, “DEMUncQual_GeoIndices.ave”, automates the processing steps required to derive the geomorphometric indices from a DEM. Other scripts are described later in the thesis.

Fig. 3.7 The DEM Uncertainty and Quality menu.

3.3.8 DEM Accuracy

To determine DEM accuracy, first, high accuracy GPS measurements of elevation were collected for calculating error and, second, momental statistics were selected to describe a DEM’s accuracy.



3.3.8.1 Collection of GPS Measurements

Two Magellan ProMark X GPS receivers were used to collect differentially corrected carrier phase data, one receiver being used as a base station and the other as a rover (Carlisle & Jordan, 1998). Position was recorded as Ordnance Survey grid coordinates. Elevation was recorded as height in metres above the Ordnance Survey’s 1936 datum (OSGB36). Magellan (1994) states that sub-metre accuracy data (RMSE = 0.9m) can be acquired with these receivers. This was deemed sufficiently accurate for the purpose of measuring the elevation error of the DEMs and was verified by surveying three Ordnance Survey triangulation points of known position and elevation on four occasions. Eight of the twelve measurements lay within 0.9m of the known position. The RMSE of elevation values was 0.83m. However, there was a non-zero mean error of 1.3m, signifying a bias in the GPS measurements. This is assumed to be due to inaccuracy in the transformation between the WGS84 datum used internally by the GPS receivers and the OSGB36 datum. Datum transformation algorithms use a single set of equation parameters that account for the average difference between two datums. The difference is in reality spatially variable. Therefore local corrections to the average transformation can be applied to improve accuracy. The mean error of 1.3m was subtracted from all subsequent GPS elevation measurements, giving sufficiently accurate GPS measurements of position and elevation.
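The bias correction described above amounts to estimating the mean error at points of known elevation and subtracting it from later measurements. A minimal sketch follows; the function names are hypothetical, the spread is assumed to be measured about the mean error, and any values passed in are illustrative rather than the survey data.

```python
import numpy as np

def estimate_bias(measured, known):
    """Mean error (bias) and sample standard deviation about that mean,
    from repeated measurements at points of known elevation."""
    errors = np.asarray(measured, dtype=float) - np.asarray(known, dtype=float)
    return float(errors.mean()), float(errors.std(ddof=1))

def remove_bias(elevations, bias):
    """Apply a local correction by subtracting the estimated bias."""
    return np.asarray(elevations, dtype=float) - bias
```

With a bias of 1.3m, for example, a measured elevation of 500.0m would be corrected to 498.7m.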

In order to obtain a reliable measure of DEM quality from the elevation error at a sample of control points, it is important that a sufficient number of points are sampled and that the distribution of the control points over the study area is representative of the terrain.

Li (1991) provides the only guidance regarding the number of samples to use, as described in §3.2.3.1. The equations he derives for estimating sample number require an initial estimate of the elevation error’s standard deviation. This figure was unknown at the time of the GPS survey. Therefore it was assumed that a sample of over 100 points should be adequate to provide a reliable estimate of DEM elevation error. Subsequently it was found that the 106 points actually surveyed would provide a 95% reliable estimate of mean elevation error for DEMs with an elevation error standard deviation of 5m or less and an estimate of the elevation error’s standard deviation within about 7% of the true value. This was deemed acceptable.
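The sample size figures can be checked approximately with standard large-sample results for a normally distributed error. It is an assumption that these expressions correspond to the equations derived by Li (1991); the function names are hypothetical.

```python
import math

def mean_half_width(sd, n, z=1.96):
    """Half-width of the approximate 95% confidence interval for the mean error."""
    return z * sd / math.sqrt(n)

def sd_relative_precision(n):
    """Approximate relative standard error of a sample standard deviation."""
    return 1.0 / math.sqrt(2.0 * (n - 1))

# 106 check points and an error standard deviation of 5m
print(round(mean_half_width(5.0, 106), 2))         # half-width in metres
print(round(100 * sd_relative_precision(106), 1))  # percent
```

Under these assumptions the standard deviation estimate from 106 points has a relative precision of about 6.9%, in line with the figure of about 7% quoted above.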



With respect to the distribution of the sample points, it was at first hoped that a stratified random sampling technique could be used to compile a list of sample point coordinates at which to carry out the survey. However, on trying to use GPS to locate these sample points in the field, this sampling scheme was abandoned for two reasons. First, a number of sample points were found to be in difficult-to-access or totally inaccessible areas. Despite the fitness and mountaineering experience of the author, some points were in areas that were too steep or rugged to cross safely (Fig. 3.8). Second, using GPS for navigation without real-time differential correction proved time-consuming and unreliable. Each point could be located to no better than 100m, so the choice of the exact final location would be a subjective decision. Even if all points had been accessible, it was difficult to determine a practical and efficient route between points. It was decided that traversing over a variety of terrain types and surveying points at fixed distance intervals was practical and efficient and would give an equally representative sample.


Fig. 3.8 GPS Fieldwork: a) the steep and rocky terrain of the Snowdonia study area, looking south west from the slopes of Glyder Fawr in the foreground to Crib y Ddysgl; b) surveying on the steepest and most rugged ground that was safely possible, on the side of Cwm Glas looking out over the Llanberis valley.



3.3.8.2 DEM Accuracy Measures

For each sample point the elevation value of the corresponding grid cell from each of the 26 DEMs was extracted. The difference in elevation between the grid cell value and the sample point’s GPS-measured elevation was calculated to give an elevation error (elevation error = DEM elevation - GPS elevation). A positive error value indicates that the DEM provides an over-estimate of elevation at that sample point’s location, while a negative error indicates an underestimate.

Based on the information presented in §3.2.3.3, the following accuracy statistics were calculated for each DEM to give a comprehensive quantitative assessment of a DEM’s errors, which includes a summary description of the size of the error values and the statistical quality of the error distribution:

mean elevation error;
average deviation from the mean (ADM);
standard deviation of the elevation errors;
90th percentile of the spread of elevation error values; and,
reliability: the number of sample points where the elevation error exceeds 3 standard deviations of the mean error.

The DEM Uncertainty and Quality extension to ArcView includes a script (“DEMUncQual_AccMeasures.ave”; Appendix 2) that automates the processing steps required to derive the accuracy measures from a DEM.
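The five accuracy measures can be sketched as below. This is not the ArcView script itself. The function name is hypothetical, and two interpretations are assumptions: the 90th percentile is taken over absolute deviations from the mean error, and reliability counts errors lying more than 3 standard deviations from the mean error.

```python
import numpy as np

def accuracy_measures(dem_z, gps_z):
    """Summary statistics of elevation error (DEM elevation minus GPS elevation)."""
    err = np.asarray(dem_z, dtype=float) - np.asarray(gps_z, dtype=float)
    mean = err.mean()
    dev = np.abs(err - mean)                       # absolute deviation from the mean error
    sd = err.std(ddof=1)                           # sample standard deviation
    return {
        "mean": float(mean),
        "adm": float(dev.mean()),                  # average deviation from the mean
        "sd": float(sd),
        "p90": float(np.percentile(dev, 90)),      # assumed definition of the 90th percentile
        "reliability": int(np.sum(dev > 3 * sd)),  # assumed definition of reliability
    }
```

A negative mean indicates that the DEM underestimates elevation on average, following the sign convention given above.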

3.3.9 Comparison of Quality Assessment Approaches

The differences between the quantitative measures of DEM quality, the geomorphometric quality indices and the accuracy measures, were analysed to determine the most efficient, yet comprehensive, way of describing a DEM’s quality. For each quantitative measure, values for the 26 DEMs were ranked. Correlation and regression analyses were performed on these ranked scores.
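Correlation of ranked scores can be sketched as follows. The helper names are hypothetical, ties are not given any special treatment (an assumption), and the example values are the average deviation from the mean and standard deviation of six of the DEMs as reported in Table 3.4.

```python
import numpy as np

def ranks(values):
    """Rank values from 1 (smallest) to n (largest); ties are broken arbitrarily."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    r = np.empty(len(values))
    r[order] = np.arange(1, len(values) + 1)
    return r

def rank_correlation(a, b):
    """Pearson correlation of the ranked scores of two quality measures."""
    return float(np.corrcoef(ranks(a), ranks(b))[0, 1])

# 10mCont, 50mCont, AllVert2, AllVert5, GenVert2, 30mGrid2
adm = [2.41, 6.56, 3.08, 2.84, 2.86, 6.84]
sd = [3.86, 9.18, 4.41, 4.06, 4.27, 10.32]
print(rank_correlation(adm, sd))
```

For this subset the two measures rank the DEMs identically, so the rank correlation is 1; the full analysis uses all 26 DEMs.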



3.3.10 Comparison of DEM Quality

The causes and extent of decreased DEM quality were investigated by considering the following questions:

A. DATA DISTRIBUTION ISSUES:
1. Which digital elevation data sampling pattern is best: contour lines, contour vertices or a regular grid?
2. How does decreasing the contour interval affect DEM quality?
3. How does reducing the number of contour vertices influence DEM quality?
4. How does increasing grid spacing influence DEM quality?

B. INTERPOLATION ISSUES:
1. Which interpolation method is best: linear, inverse distance weighting or spline?
2. How does increasing the search radius from 6 to 12 points influence DEM quality?
3. How does the weight used in inverse distance weighting influence DEM quality?
4. What difference is there between regularised splines and splines with tension in terms of DEM quality?

C. SOFTWARE ISSUES:
1. Does the quality of inverse distance weighting DEMs vary between the ArcView and Idrisi packages?



3.4 Results

3.4.1 Visual Assessment

3.4.1.1 Elevation Renderings and Orthographic Views

Displaying the DEMs with grid cells shaded according to their elevation allows a cursory inspection of the major landforms represented. All DEMs represented the largest features, such as the cirques (Cwm Glas and Cwm Glas Mhor), the Llanberis Valley, the Crib Goch ridge and the general shape of the study area terrain (Fig. 3.9c). DEMs generated from less dense sampling patterns (the 30m and 50m grids and the 50m contours) appear to give a more generalised surface representation, as would be expected. These DEMs seemed to incorporate fewer fine details.

To progress any further than this cursory inspection involved examining small areas of the DEM displayed as orthographic views. Below is an outline of the DEM characteristics revealed by this examination. The diagrams illustrating these characteristics are all taken from two 100m² areas of the study area, which provide good examples of the differences between the DEMs (Fig. 3.9). Area 1 comprises low rock outcrops on a relatively uniform underlying slope, with an abrupt step running across the lower part of this slope (Fig. 3.9a). Diagrams from this area are black and white. The coloured diagrams are from Area 2, which covers more variable terrain within Cwm Glas comprising part of the Llyn Glas tarn, two rocky mounds and an abrupt slope (Fig. 3.9b). The view direction of all orthographic views is towards the southwest.



Fig. 3.9 Locations of Area 1 and Area 2: a) location of Area 1 shown on 1:25 000 topographic map; b) location of Area 2 shown on 1:25 000 topographic map (Ordnance Survey, 1998); c) both areas marked on display of digital contours of the whole study area.

Below are summaries of the type of DEM and the features represented for each variant of the DEM generation techniques, i.e. the different elevation data sampling patterns, the different interpolation algorithms and, in the case of inverse distance weighting, the different interpolation weighting functions.

SPLINE INTERPOLATION (Fig. 3.10)

Spline interpolation produces smooth, rounded surfaces. This appears to over-simplify the ruggedness of the mountain terrain. Nonetheless, smaller surface details are represented, particularly the two mounds in Area 2, while the abrupt step in Area 1 is less clear, although present. Additionally, the underlying shape of the surface is well portrayed. However, in Area 2 there is a pit in the lake and no representation of the land damming back the lake.


Fig. 3.10 DEM produced using spline with tension interpolation and a 12 point radius: a) Area 1; b) Area 2.

LINEAR INTERPOLATION FROM CONTOUR LINES

Linear interpolation of the 10m contours (Fig. 3.11) gives a good representation of Area 1, showing clearly the abrupt step. The overall shape of the surface is more planar than the spline generated DEMs. The linear technique does not perform so well in Area 2, where a number of artefacts are apparent. These include the trench running through the lake, the plateau on top of the left hand mound and the ramp-like features of the right hand mound. However, the lake surface is smooth and there is only a narrow outlet for the water.

Fig. 3.11 DEM produced using linear interpolation of 10m contours: a) Area 1; b) Area 2.

The DEM interpolated from 50m contours (Fig. 3.12) clearly shows the importance of using an appropriate density of elevation data. At the 50m contour interval, the representation of Area 2 is evidently very poor, just an assortment of planar surfaces. Performance is better in the less heterogeneous Area 1, giving an approximation of the underlying surface trends, although the snake-like feature on the lower slopes is anomalous.

Fig. 3.12 DEM produced using linear interpolation of 50m contours: a) Area 1; b) Area 2.

INVERSE DISTANCE WEIGHTING

From a visual assessment the DEM generated from all contour vertices using Idrisi’s INTERPOL routine and a weighting of 2 seems to be the best of the inverse distance weighting DEMs (Fig. 3.13). This DEM shows the underlying form of the surface and includes representation of the smaller features, particularly the step in Area 1. However, there are also artefacts present, namely the noisy hummocks seen in the visualisation of Area 1, and in Area 2, a plateau on the right hand mound, no barrier to the lake’s water and a terrace in the slope dropping away from Llyn Glas. The noisy hummocks are evidence of a bias towards the height of the contour vertices.

Fig. 3.13 DEM produced from all contour vertices using Idrisi’s INTERPOL routine with a weighting of 2: a) Area 1; b) Area 2.

The performance of DEMs generated using inverse distance weighting is heavily dependent on both the chosen interpolation parameters and the elevation data distribution. For Area 1 the surface of the DEM generated from all contour vertices with the INTERPOL routine using a weight of 5 is of poorer quality than the one described above, with “puddles” of near equal elevation around the data points (Fig. 3.14). This is due to the use of a distance weighting of 5, as opposed to 2. These puddles obscure the general slope and the abrupt step.

Fig. 3.14 View of Area 1 for DEM produced from all contour vertices using Idrisi’s INTERPOL routine with a weighting of 5.

With respect to the density of elevation data, the DEM generated from a regular 50m grid using an inverse distance weighting of 2 clearly shows the generalising effect of reducing the amount of input data in Area 2 (Fig. 3.15b). However, a rough approximation of the surface form remains and the degradation is less drastic than is found in the DEM derived from 50m contours (Fig. 3.12). This is less true for Area 1, where even though only a distance weighting of 2 has been used, the puddling of elevation values around the data points obliterates even a rough approximation of the surface form (Fig. 3.15a).


Fig. 3.15 DEM produced from regular 50m grid of points using Idrisi’s INTERPOL routine with a weighting of 2: a) Area 1; b) Area 2.

The DEMs presented above give a good indication of the main findings of the visualisation exercise. The ArcView inverse distance weighting DEM visualisations are not shown as they reveal similar issues to those presented, i.e. a higher weight gives rise to more severe puddling, and flat peaks, terraces and troughs are created.



3.4.1.2 Rendering Gradient and Aspect Images

Rendering gradient and aspect images with a grey scale palette clearly shows artefacts of the data sampling patterns and interpolation methods. Images representative of the main findings are shown below. In the gradient images gentle gradients are displayed as dark grey / black and steep gradients as light grey / white. In the aspect images, aspect values of near to 0° are displayed as dark grey / black and aspect values of near to 360° are displayed as light grey / white. While not being ideal for conveying aspect information, this linear colour scheme is more effective than a circular palette at portraying interpolation artefacts. The images are of the whole study area.

The smooth surfaces generated by spline interpolation lead to a reasonably even distribution of gradient values, although narrow light coloured bands of steep gradient seen in Fig. 3.16a show some degree of terracing. The aspect image shows no signs of anomalies (Fig. 3.16b).

Fig. 3.16 Derivatives of DEM produced using spline with tension and a 12 point search radius: a) gradient; b) aspect.

Linear interpolation of 10m contours gives a similar distribution of gradient values, but some light coloured diagonal, vertical and horizontal streaks are evidence of the artefacts of the interpolation algorithm’s search directions (Fig. 3.17a). The aspect image (Fig. 3.17b) appears to show more surface detail than that of the smooth spline interpolators (Fig. 3.16b). However, interpolation artefacts show up as vertical, horizontal and diagonal lines towards the lower right corner.

Fig. 3.17 Derivatives of DEM produced using linear interpolation of 10m contours: a) gradient; b) aspect.

Linear interpolation of the 50m contours gives rise to more widespread interpolation artefacts, which are clearly shown in both the gradient and aspect images (Fig. 3.18).


Fig. 3.18 Derivatives of DEM produced using linear interpolation of 50m contours: a) gradient; b) aspect.

Gradient images of DEMs produced using inverse distance weighting clearly show the terracing effect as linear bands of steep gradient interspersed by flat areas (Fig. 3.19a). In fact the gradient image looks very similar to a display of the original digital contour lines. The slightly speckled character of the aspect image in Fig. 3.19b is evidence of the “puddle” effect identified earlier as an artefact of inverse distance weighting.


Fig. 3.19 Derivatives of DEM produced from all contour vertices using inverse distance weighting with a weight of 2: a) gradient; b) aspect.

The narrower steep bands of Fig. 3.20a show that increasing the inverse distance weight makes the terracing more pronounced. The increase in speckling in Fig. 3.20b shows that increasing the weight also exacerbates the “puddle” effect.


Fig. 3.20 Derivatives of DEM produced from all contour vertices using inverse distance weighting with a weight of 5: a) gradient; b) aspect.

The potential influence of the sampling pattern on resultant DEMs is clearly shown in Fig. 3.21. It is significant that, even after interpolating contours to a regular 50m grid and then interpolating a DEM from the grid, a certain degree of terracing is still evident in Fig. 3.21a. Fig. 3.21b shows the blocky nature of the surface derived from the regular grid.


Fig. 3.21 Derivatives of DEM produced from a regular 50m grid of points using inverse distance weighting with a weight of 2: a) gradient; b) aspect.

3.4.2 Geomorphometric Characteristics

Table 3.3 shows the values of the four geomorphometric quality indices calculated for each DEM. These indices are:

contour bias: the percentage of grid cells within 0.5m of a multiple of the contour interval;
flatness index: the percentage of grid cells with gradient values of less than 0.5%;
blockiness index: the percentage of grid cells with aspect values within 0.5° of cardinal directions; and,
pit volume: the volume of pits (m³) found and filled by Idrisi’s pit removing routine.


Table 3.3 Geomorphometric quality indices.

DEM        Contour bias (%)   Flatness index (%)  Blockiness index (%)  Pit volume (m³)
10mCont    11.90              1.42                19.74                 30401
50mCont    3.16 (≡ 15.80)†    0.79                16.18                 196953
AllVert2   41.52              26.37               15.24                 13167
AllVert5   58.01              19.96               15.24                 2411
GenVert2   28.93              26.00               31.52                 20378
GenVert5   58.01              19.96               6.88                  3400
10mGrid26  25.57              7.31                6.00                  3118
10mGrid56  22.53              30.85               21.30                 241
30mGrid26  17.34              1.28                2.63                  18227
30mGrid56  36.89              44.83               32.00                 3596
50mGrid26  14.33              0.46                1.84                  47140
50mGrid56  32.75              13.92               3.67                  4618
IDW16      17.57              7.50                2.12                  12913
IDW26      28.09              7.57                1.73                  14768
IDW36      40.54              8.86                1.89                  10228
IDW46      50.50              13.10               2.63                  5856
IDW56      58.04              19.10               3.73                  3267
IDW112     13.30              1.82                1.59                  15467
IDW212     23.19              1.91                1.36                  23326
IDW312     36.79              3.62                1.63                  17399
IDW412     48.08              8.74                2.61                  9940
IDW512     56.53              15.77               3.94                  5296
SpReg6     19.2               1.8                 0.9                   26746
SpReg12    16.1               1.5                 0.7                   28037
SpTen6     15.3               1.3                 1.6                   12721
SpTen12    11.9               1.0                 1.3                   12892

†Multiplying the 50mCont percentage by 5 makes it comparable with the other DEMs, which were all derived from 10m contour interval data.

The results for contour bias and flatness index suggest that the DEMs can be classified as falling into one of two categories: the first where the contour bias is below 20% and the flatness index is below 4%; and, a second, where the contour bias lies between 22% and 60% and the flatness index is greater than 7%. The DEMs in the first group show little statistical evidence of a bias towards the original data values and hence no significant terracing. These DEMs are:

those generated from the contour lines (by linear interpolation);
those generated from a 30m or 50m interval grid of points where an inverse distance weight of 2 is used;
those generated from the generalised contour vertices where an inverse distance weight of 1 is used; and,
those generated using spline interpolation (from the generalised contour vertices).

All other DEMs show a notable bias towards the contour interval and can be considered distinctly terraced. They have all been generated using inverse distance weighting interpolation. This does not necessarily mean that IDW is a poor interpolator. Rather, the uneven data distribution (excessively dense along contour lines, but no sampling between contour lines) probably has an important role in causing the interpolator to behave in this way. However, it is apparent that inverse distance weighting interpolation is susceptible to problems of data distribution and accordingly should be used with care.

The IDW DEMs give evidence of a number of additional issues. First, increasing the inverse distance weighting clearly increases the contour bias. This makes sense as with higher weightings the interpolation depends more and more on the value of the one nearest data point. Second, a larger search radius reduces the contour bias, but not as much as by reducing the weighting. Third, using fewer data points, in this case by applying line generalisation to simplify the source contours, can reduce the degree of contour bias.
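The growing dominance of the nearest data point as the weighting increases can be illustrated numerically. The function name and the distances below are hypothetical values chosen for illustration, not measurements from the study area.

```python
import numpy as np

def nearest_point_share(distances, power):
    """Fraction of the total inverse distance weight carried by the nearest point."""
    d = np.asarray(distances, dtype=float)
    w = 1.0 / d ** power
    return float(w.max() / w.sum())

# Hypothetical distances (m) from a grid cell to its six nearest data points
distances = [5, 20, 20, 25, 30, 30]
for power in (1, 2, 3, 4, 5):
    print(power, round(nearest_point_share(distances, power), 3))
```

The share rises steadily with the weighting, so a cell lying close to a contour vertex takes a value ever closer to the contour height, which is consistent with the increase in contour bias.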

The values for the blockiness index raise four main points. First, there is no evidence of blockiness in any of the DEMs generated in ArcView, which were derived from line-generalised contour vertices using either inverse distance weighting or spline interpolation. ArcView interpolation techniques have performed well in this respect.

The DEMs generated in Idrisi show a varying degree of blockiness. Both DEMs interpolated linearly from contour lines show a significant degree of blockiness, with index values of 19.74% and 16.18%. This is likely to reflect the interpolation algorithm, which searches in the eight principal compass directions, rather than the source data.

The Idrisi DEMs generated from contour vertices using inverse distance weighting also show a degree of blockiness. This contrasts with the inverse distance weighted DEMs generated in ArcView and suggests that Idrisi employs a less sophisticated inverse distance weighting algorithm. There is an interesting difference in the degree of blockiness between the DEMs generated from all the contour vertices and those generated from just the line-generalised vertices. It seems that using all the vertices gives a data sample which is sufficiently dense to partially suppress the tendency for the Idrisi algorithm to favour the cardinal compass points, while the line-generalised vertices sampling pattern provides enough space between sample points for the interpolation bias to exert a greater influence.

The blockiness index values for the Idrisi DEMs derived from a regular grid of points give an interesting twist to the previous point. Blockiness is evident in the 10m and 30m grid DEMs similar to that of the contour vertices DEMs. However, the 50m grid DEMs show no evidence of this bias. So with a grid based arrangement of sample points a denser sampling pattern causes a greater degree of blockiness. This is the reverse of what has just been observed in the contour vertices DEMs. This apparent inconsistency can be partly ascribed to the difference in the spatial distribution of data for the two groups of DEMs and partly to the sparseness of the 50m grid data. Due to the regular grid arrangement of data points, the bias in Idrisi’s interpolator is exacerbated even at high data densities. The 50m grid DEMs are derived from the least dense data set. At this density data points are evidently too widely spaced for the interpolator bias to remain apparent.

The pit volume figures also raise points regarding the quality of the source data and interpolation methods. A comparison of interpolation methods’ tendency to create pits can be made by comparing figures for DEMs with similar source data. The DEM created from linear interpolation of 10m contours (10mCont: 30400m³) has more than twice the volume of pits of the DEM created from inverse distance weighting with a weight of 2 applied to all contour vertices (AllVert2: 13167m³). So the contour interpolator has a greater tendency to create pits than inverse distance weighting. Of the interpolations from generalised contour vertices, regularised spline interpolation leads to higher pit volumes than inverse distance weighting. However, using spline with tension reduces pit volumes to levels similar to the highest pit volumes for inverse distance weighted DEMs. So linear contour interpolation causes the greatest pit volumes, followed by regularised spline interpolation, then spline with tension, while inverse distance weighting causes the lowest pit volumes. The susceptibility of regularised spline interpolation to producing significant pits contrasts with Mitasova et al. (1996) and Eklundh & Mårtensson (1995), who found that spline methods were less prone to spurious pit problems.



Considering the pit volumes of inverse distance weighting DEMs, it can be seen that increasing the weighting reduces pit volumes, while increasing the search radius increases pit volumes.

Finally, the distribution of source data has an influence on pit volumes too. By considering a regular 10m grid (10mGrid26: 3118m³) compared to a regular 50m grid (50mGrid26: 47140m³), 10m contours (10mCont: 30400m³) compared to 50m contours (50mCont: 196953m³), and all contour vertices (AllVert2: 13167m³) compared to generalised contour vertices (GenVert2: 20378m³), it is clear that lower density source data leads to greater pit volumes. By examining images showing the location of the pits, it is evident that the pits mainly occur where flat areas are surrounded by steeper gradients except for a narrow outlet, such as a glacial cirque or smaller features of similar form. As the density of source data is decreased, definition of these narrow outlets is lost and the volume of pits increases.

3.4.3 DEM Accuracy

The five measures of DEM accuracy for all 26 DEMs are shown in Table 3.4.

Table 3.4 DEM Accuracy Measures.

DEM       Mean (m)  ADM (m)  Standard deviation (m)  90th percentile (m)  Reliability
10mCont   -2.77     2.41     3.86                    4.11                 4
50mCont   -3.53     6.56     9.18                    15.64                1
AllVert2  -2.52     3.08     4.41                    6.16                 2
AllVert5  -2.87     2.84     4.06                    6.42                 2
GenVert2  -2.93     2.86     4.27                    6.70                 3
GenVert5  -2.87     2.84     4.06                    6.30                 2
10mGrid2  -3.06     2.79     3.93                    6.12                 2
10mGrid5  -2.55     3.08     4.28                    5.67                 2
30mGrid2  -3.25     6.84     10.32                   14.68                3
30mGrid5  -2.28     4.54     6.56                    9.06                 3
50mGrid2  -3.95     6.02     8.48                    11.58                2
50mGrid5  -3.52     6.08     9.27                    12.70                2
IDW16     -3.08     3.10     4.47                    7.41                 3
IDW26     -2.95     2.81     4.17                    6.97                 3
IDW36     -2.87     2.70     4.03                    7.13                 3
IDW46     -2.83     2.70     3.96                    7.00                 3
IDW56     -2.80     2.74     3.95                    6.68                 2
IDW112    -2.72     3.74     5.11                    8.11                 1
IDW212    -2.76     3.13     4.45                    7.19                 2
IDW312    -2.78     2.86     4.14                    6.98                 3
IDW412    -2.78     2.77     4.01                    6.82                 3
IDW512    -2.77     2.79     3.96                    6.71                 3
SpReg6    -2.74     2.32     3.82                    4.13                 4
SpReg12   -2.77     2.40     3.88                    4.25                 4
SpTen6    -2.72     2.29     3.77                    3.96                 4
SpTen12   -2.60     2.28     3.78                    3.96                 4

The values for mean error show that all DEMs underestimate elevation in comparison to the GPS measurements, by between 2.28m and 3.95m. This consistent bias indicates that, despite verification of the GPS accuracy and adjustment of the transformation from the WGS84 to OSGB36 datums, there is still a discrepancy of about 2.5m between GPS elevation measurements and DEM elevations. The cause of this bias is unknown.

Based on the degree of variation between DEMs for each accuracy measure, the amount of detail given by each measure is variable. The mean error values are closely grouped, while average deviation from the mean, standard deviation and then 90th percentile distinguish to a greater and greater extent between the DEMs. The reliability measure only takes one of four integer values and is therefore not very discerning.



3.4.4 Comparison of Quality Assessment Approaches

Correlation analysis has been used to assess which of the quantitative approaches to quality assessment give unique information that is not duplicated by other measures or indices. Table 3.5 shows the results of the correlation analysis.

Table 3.5 Correlation between quality measures

              Mean   ADM    SD     90th%  Reliability  Contour bias  Flatness index  Blockiness index  Pit volume
Mean          1.00
ADM           0.33   1.00
SD            0.37   0.99   1.00
90th%         0.30   0.96   0.94   1.00
Reliability   0.17   -0.47  -0.37  -0.54  1.00
Contour bias  -0.11  -0.24  -0.28  -0.10  -0.20        1.00
Flatness      -0.06  -0.06  -0.09  -0.06  -0.20        0.54          1.00
Blockiness    -0.04  0.09   0.06   0.05   -0.12        0.06          0.72            1.00
Pit volume    -0.15  0.49   0.47   0.54   -0.30        -0.35         -0.34           0.13              1.00

A high correlation between two variables indicates that these variables give similar information about the quality of a DEM. The strong correlations (R > 0.9) between average deviation from the mean, standard deviation and 90th percentiles show that these three measures describe similar characteristics of a DEM's error distribution. There is a strong case for only one of these three accuracy measures being required in a DEM quality assessment. The blockiness and flatness indices are moderately correlated (R = 0.72). One of these two geomorphometric indices is probably redundant. Reliability and pit volume have slight correlations (|R| ≥ 0.3) with most other variables and may be redundant.
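As an illustration of this redundancy check, the correlation between two of the accuracy measures can be recomputed from the ADM and standard deviation columns of Table 3.4. The sketch below assumes that Pearson's product-moment correlation was the statistic used; the function name is illustrative only.

```python
import numpy as np

# ADM and standard deviation columns of Table 3.4, in table order (26 DEMs).
adm = np.array([2.41, 6.56, 3.08, 2.84, 2.86, 2.84, 2.79, 3.08, 6.84,
                4.54, 6.02, 6.08, 3.10, 2.81, 2.70, 2.70, 2.74, 3.74,
                3.13, 2.86, 2.77, 2.79, 2.32, 2.40, 2.29, 2.28])
sd = np.array([3.86, 9.18, 4.41, 4.06, 4.27, 4.06, 3.93, 4.28, 10.32,
               6.56, 8.48, 9.27, 4.47, 4.17, 4.03, 3.96, 3.95, 5.11,
               4.45, 4.14, 4.01, 3.96, 3.82, 3.88, 3.77, 3.78])

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length arrays."""
    return float(np.corrcoef(x, y)[0, 1])

r_adm_sd = pearson_r(adm, sd)   # expected to exceed 0.9, as in Table 3.5
```

Passing all nine measures to `np.corrcoef` as rows of a single array would return a full matrix with the layout of Table 3.5.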

For each of the nine quality measures, values were standardised to a range of 0 (highest quality) to 100 (lowest quality). Then, these standardised values were added to give an accuracy score and a geomorphometric quality score.
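The standardisation and summation can be sketched as a min-max rescaling. The example uses the mean error column of Table 3.4 and takes the magnitude of each mean before rescaling; this last step is an inference, made because it appears to reproduce the standardised mean values later listed in Table 3.7, and is not stated explicitly in the text. Function names are illustrative.

```python
import numpy as np

def standardise(values):
    """Rescale so the smallest value maps to 0 (highest quality)
    and the largest to 100 (lowest quality)."""
    v = np.asarray(values, dtype=float)
    return 100.0 * (v - v.min()) / (v.max() - v.min())

def quality_score(*standardised_columns):
    """Add standardised measures, DEM by DEM, to give a score."""
    return np.sum(standardised_columns, axis=0)

# Mean error column of Table 3.4, in table order (26 DEMs).
mean_error = np.array([-2.77, -3.53, -2.52, -2.87, -2.93, -2.87, -3.06,
                       -2.55, -3.25, -2.28, -3.95, -3.52, -3.08, -2.95,
                       -2.87, -2.83, -2.80, -2.72, -2.76, -2.78, -2.78,
                       -2.77, -2.74, -2.77, -2.72, -2.60])

# Assumption: the size of the bias, not its sign, is what is standardised.
mean_std = standardise(np.abs(mean_error))
```

A score is then obtained by calling `quality_score` with one standardised array per measure, for example the standardised mean, standard deviation and reliability.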

Regression analysis was used to further explore the contribution that each geomorphometric index makes to the geomorphometric quality score, and likewise for accuracy measures and the accuracy score. Stepwise regression was used to identify the quality measure that best reflected the distribution of the scores, then the two best quality measures, then the three best, and so on. The results of this analysis are shown in Table 3.6.

Table 3.6 Results of stepwise regression analysis of quality measures and geomorphometric and accuracy scores.

ACCURACY QUALITY SCORE
Number of variables   Variables                                  Regression coefficient (R²)
1                     Standard deviation                         0.926
2                     The above + reliability                    0.960
3                     The above + mean                           0.992
4                     The above + 90th percentile                1.000
5                     The above + average deviation from mean    1.000

GEOMORPHOMETRIC QUALITY SCORE
Number of variables   Variables                                  Regression coefficient (R²)
1                     Flatness index                             0.759
2                     The above + contour bias                   0.874
3                     The above + pit volume                     0.954
4                     The above + blockiness index               1.000

The third column of Table 3.6 (R²) indicates the proportion of the variation in the score that can be described by the variables for that row. Over 99% of the variation in accuracy is described by three of the five accuracy measures. 90th percentile and average deviation from the mean provide little additional information about accuracy. Over 95% of the variation in geomorphometric quality is described by three of the four geomorphometric indices. The blockiness index duplicates much of the information provided by the other indices. This confirms the findings of the correlation analysis described above. In light of this variable redundancy, the geomorphometric and accuracy scores were recalculated using just the most useful three variables in each case.
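The idea behind this analysis can be sketched with a simple forward selection, in which the variable that raises R² most is added at each step. This is a simplified stand-in, not the stepwise routine of the statistical package used in this study (such routines normally apply entry and removal tests); the function names and the random input data are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of a least-squares fit of y on the columns of X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residual = y - A @ coef
    return 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)

def forward_selection(columns, y):
    """Return [(variable added, cumulative R^2), ...] in order of entry."""
    chosen, steps = [], []
    remaining = list(columns)
    while remaining:
        # Try each unused variable alongside those already chosen.
        best = max(remaining, key=lambda name: r_squared(
            np.column_stack([columns[n] for n in chosen + [name]]), y))
        chosen.append(best)
        remaining.remove(best)
        fit = r_squared(np.column_stack([columns[n] for n in chosen]), y)
        steps.append((best, fit))
    return steps

# Hypothetical standardised measures for 26 DEMs; the score is their sum.
rng = np.random.default_rng(0)
measures = {"measure_a": 3 * rng.random(26),
            "measure_b": 2 * rng.random(26),
            "measure_c": rng.random(26)}
score = sum(measures.values())
steps = forward_selection(measures, score)
```

Because each score is the sum of its measures, R² must reach 1 once every measure has entered the model; the informative part of the output is how quickly it approaches 1.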



3.4.5 Comparison of DEM Quality

The geomorphometric and accuracy scores were combined to help compare the quality of the DEMs. It was assumed that simply adding the two scores together would give a fair representation of overall quality, because each score was derived from three indices or measures. Each DEM's rank position was also calculated. Scores and rankings for each DEM are shown in Table 3.7.

Table 3.7 DEM Quality Measures and Scores (with rankings in brackets).

DEM        Mean    SD      Reliability   Accuracy Score   Contour bias   Flatness index   Pit volume   Geomorph. Score   Overall Score
10mCont    29.3    1.4     100           131 (21)         0              2.2              15.3         18 (4)            148 (6)
50mCont    74.9    82.6    0             157 (23)         8.5            0.7              100          109 (19)          267 (25)
AllVert2   14.4    9.8     33.3          58 (3)           64.2           58.4             6.6          129 (21)          187 (13)
AllVert5   35.3    4.4     33.3          73 (6)           99.9           43.9             1.1          145 (24)          218 (19)
GenVert2   38.9    7.6     66.7          113 (16)         36.9           57.6             10.2         105 (18)          218 (18)
GenVert5   35.3    4.4     33.3          73 (6)           99.9           43.9             1.6          146 (25)          219 (20)
10mGrid2   46.7    2.4     33.3          83 (8)           29.6           15.4             1.5          47 (11)           129 (3)
10mGrid5   16.2    7.8     33.3          57 (2)           23             68.5             0            92 (16)           149 (7)
30mGrid2   58.1    100     66.7          225 (26)         11.8           1.8              9.1          23 (5)            248 (23)
30mGrid5   0       42.6    66.7          109 (14)         54.2           100              1.7          156 (26)          265 (24)
50mGrid2   100     71.9    33.3          205 (25)         5.3            0                23.8         29 (7)            234 (22)
50mGrid5   74.3    84      33.3          192 (24)         45.2           30.3             2.2          78 (14)           269 (26)
IDW16      47.9    10.7    66.7          125 (18)         12.3           15.9             6.4          35 (9)            160 (10)
IDW26      40.1    6.1     66.7          113 (15)         35.1           16               7.4          59 (12)           171 (11)
IDW36      35.3    4       66.7          106 (13)         62.1           18.9             5.1          86 (15)           192 (14)
IDW46      32.9    2.9     66.7          103 (12)         83.7           28.5             2.9          115 (20)          218 (17)
IDW56      31.1    2.7     33.3          67 (4)           100            42               1.5          144 (23)          211 (16)
IDW112     26.3    20.5    0             47 (1)           3              3.1              7.7          14 (2)            61 (1)
IDW212     28.7    10.4    33.3          73 (5)           24.5           3.3              11.7         40 (10)           112 (2)
IDW312     29.9    5.6     66.7          102 (11)         53.9           7.1              8.7          70 (13)           172 (12)
IDW412     29.9    3.7     66.7          100 (10)         78.4           18.7             4.9          102 (17)          202 (15)
IDW512     29.3    2.9     66.7          99 (9)           96.7           34.5             2.6          134 (22)          233 (21)
SpReg6     26.3    0       100           126 (19)         15.8           3                13.5         32 (8)            159 (9)
SpReg12    27.5    0.8     100           128 (20)         9.1            2.3              14.1         26 (6)            154 (8)
SpTen6     19.2    0.2     100           119 (17)         7.4            1.9              6.3          16 (3)            135 (4)
SpTen12    29.3    1.7     100           131 (22)         0              1.2              6.4          8 (1)             139 (5)

A key feature of the scores in Table 3.7 is that DEMs that have high geomorphometric quality do not necessarily have high accuracy and vice versa. There is a weak negative correlation between the two scores (R = -0.38). Accuracy and geomorphometric character are important, but unrelated, aspects of a DEM's quality.
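The relationship between the two scores can be checked directly from the accuracy and geomorphometric score columns of Table 3.7. The sketch below assumes Pearson's correlation; because the tabulated scores are rounded to whole numbers, the result should be a weak negative value close to, but not necessarily identical to, the reported figure.

```python
import numpy as np

# Accuracy and geomorphometric scores from Table 3.7, in table order (26 DEMs).
accuracy = np.array([131, 157, 58, 73, 113, 73, 83, 57, 225, 109, 205, 192,
                     125, 113, 106, 103, 67, 47, 73, 102, 100, 99, 126, 128,
                     119, 131], dtype=float)
geomorph = np.array([18, 109, 129, 145, 105, 146, 47, 92, 23, 156, 29, 78,
                     35, 59, 86, 115, 144, 14, 40, 70, 102, 134, 32, 26,
                     16, 8], dtype=float)

# Pearson correlation between the two scores (assumed statistic).
r_scores = float(np.corrcoef(accuracy, geomorph)[0, 1])

# Overall score as the simple sum of the two scores, as assumed in the text.
overall = accuracy + geomorph
```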

The questions listed in §3.3.10 as a basis for evaluating the causes of DEM quality have been addressed by considering the figures in Table 3.7 for subsets of the 26 DEMs. Issues raised by this evaluation are presented in §3.5.2.

3.5 Discussion

The following discussion addresses the key issues raised by the assessment of the quality of the 26 DEMs. This discussion consists of two parts: first, the process of assessing DEM quality is considered; second, the quality of the 26 DEMs is examined.

3.5.1 Assessing DEM Quality

The results show that all three approaches to quality assessment provide useful and unique information about the quality of a DEM. The type of information provided by each approach is considered below, followed by consideration of what a comprehensive report of a DEM's quality should comprise.

3.5.1.1 Visual Assessment

Raster rendering of the DEMs only permits a check that all major landforms are present. Orthographic views, gradient and aspect images reveal much more about the quality of the DEMs. Representation of features can be checked, the general nature of the surface can be assessed and artefacts of the source data's sampling pattern and the interpolation method can be identified. An orthographic view must be zoomed in on a small region for artefacts to be identified. This is not so for gradient and aspect images in which one can see artefacts across the whole DEM at once.

While visualisation only allows a subjective assessment of quality, this approach is quick and also gives the most dramatic indication of how serious interpolation artefacts can be. Visualisation is a useful starting point to a quality assessment, allowing the general nature and extent of problems to be identified and giving the opportunity to discard or improve unacceptably poor DEMs before any further investigation is undertaken.

3.5.1.2 Geomorphometric Indices

The four geomorphometric indices used here provide a useful quantification of the extent of interpolation artefacts as well as a broad indication of the character of the DEM surface. Terracing and puddles are measured by the contour bias and blockiness indices respectively. The flatness index gives an indication of how smooth or angular the surface is. The pit volume identifies problems either due to the interpolation algorithm used, particularly spline interpolators, or due to insufficient source data, causing definition of narrow openings to be lost. The indices are reasonably quick to calculate and involve no data other than the DEM itself. Their quantitative nature allows comparison of two DEMs. However, the pit volume measurement would need to be standardised according to DEM extent if the quality of DEMs of different locations were to be compared. The flatness and blockiness indices would also need standardising according to the nature of the terrain. Without standardisation these indices compare the nature of the terrain rather than the quality of the DEMs.

3.5.1.3 Accuracy Measures

Using a GPS survey of sample points to calculate elevation errors and estimate DEM accuracy has been shown to be a practical method of quantifying DEM quality. However, there is some degree of bias in the selection of sample points. First, the steepest, most rugged terrain is inaccessible and therefore is not represented by the sample points. Second, differential correction of the GPS data will be most successful on more exposed or open ground. It is less likely that data from the base of a steep slope or from within a glacial cirque can be corrected. Obtaining sufficiently accurate data for such locations may require two or more attempts. It is important to obtain as representative a sample of GPS measurements as possible and therefore sufficient time must be allowed for the survey.

Of the five accuracy measures used here, standard deviation, mean and reliability give a comprehensive and non-duplicative summary of the error distribution. Standard deviation, average deviation from the mean and the 90th percentile give similar information about the dispersion of the errors about the mean. Only standard deviation need be used. Of all the measures, standard deviation best reflects accuracy and has the advantage of being a well known and easy to calculate value.

3.5.1.4 A Comprehensive DEM Quality Report

Visualisation, measuring geomorphometric characteristics and assessing accuracy represent three different approaches to DEM quality assessment, which evaluate and communicate quality in different ways. The three approaches can be considered as different levels of assessment.

Visualisation provides the simplest and most qualitative approach. Orthographic views, gradient and aspect images provide a quick insight into how well a DEM represents the major landforms of an area, the general character of the modelled surface and the extent and type of interpolation artefacts. If DEM quality information is to be communicated to other DEM users, a quality report can contain a written statement describing any major blunders, omissions or artefacts. However, the main advantage of the visualisation approach is to give the user of a DEM an early opportunity to discard or improve an inadequate DEM.

Geomorphometric characterisation and accuracy assessment are two quantitative approaches to DEM quality assessment. The correlation between accuracy and geomorphometric scores shows that these two approaches are unrelated, identifying different aspects of a DEM's quality. Indeed the slight negative correlation indicates that the two aspects of quality are to some extent conflicting.

The geomorphometric indices provide a way of quantifying the aspects of DEM quality revealed by visualisation. Correlation and regression analysis show that information provided by the blockiness index is largely replicated in the flatness index. So, of the four indices considered here, contour bias, flatness and pit volume give a comprehensive quantification of the general character of the modelled surface and interpolation artefacts. The indices are simple to calculate and require no additional data sources. Values for these three indices are easily included in a DEM quality report and provide valuable information about the quality of a DEM.



The accuracy measures reported here are based on high accuracy on-the-ground measurement of elevation. This requires a greater investment of time than the other approaches to quality assessment. Ground survey gives the best assessment of accuracy, but if necessary other sources of higher accuracy data may be adequate. Such other sources include photogrammetric measurement, comparison with a more accurate DEM or use of discarded contour line vertices. However, use of these sources only allows comparison between two models of elevation rather than comparison with “true” elevation.

A number of accuracy measures are available to summarise the distribution of error values. Measures of dispersion, such as the standard deviation and RMSE, are the most widely used. As noted in §3.2.3.3, RMSE should not be used, as the assumption of zero mean is not valid. This research has shown that standard deviation provides the most information about a DEM's accuracy and should be used in favour of other measures of dispersion such as average deviation from the mean and percentiles. However, the standard deviation alone does not provide a full description of a DEM's accuracy. The mean error provides information about any systematic bias to under- or over-estimation of elevation. Day and Miller's (1988) reliability measure provides information about the occurrence of outliers to the distribution. Kurtosis may be a better way of describing the tails of the distribution. Also skew may provide further important information about the error distribution. However, these two momental statistics have not been used in this study.

It is clear that the common practice of only giving the RMSE to describe a DEM's quality is inadequate. The RMSE is a poor measure of DEM accuracy when there is a non-zero mean error, and accuracy is only one aspect of a DEM's quality. Visualisation, by means of orthographic views, gradient and aspect images, should be strongly encouraged to make a basic initial assessment of quality before any DEM is used. In the absence of an adequate quality report a DEM user should at least calculate and evaluate the three geomorphometric indices described above, but preferably also assess accuracy using the recommended three error distribution statistics. Any individual or organisation making a DEM available for use by others should provide a quality report which describes artefacts of the DEM generation process and gives the standard deviation, mean, reliability, contour bias, flatness index and pit volume of that DEM.
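The effect of a non-zero mean error on RMSE can be made concrete. For a set of errors, the squared RMSE equals the squared mean error plus the variance about the mean (exactly so when the variance uses an n denominator). The sketch below applies this identity to the mean and standard deviation reported for 10mCont in Table 3.4; the function name is illustrative, and the number of check points is left optional because it is not restated in this section.

```python
import math

def rmse_from_mean_and_sd(mean_error, sd, n=None):
    """RMSE implied by a mean error and a standard deviation.

    If n is given, sd is treated as a sample standard deviation
    (n - 1 denominator) and converted before use.
    """
    variance = sd ** 2 if n is None else sd ** 2 * (n - 1) / n
    return math.sqrt(mean_error ** 2 + variance)

# 10mCont in Table 3.4: mean error -2.77 m, standard deviation 3.86 m.
rmse_10mcont = rmse_from_mean_and_sd(-2.77, 3.86)

# With no bias, the RMSE reduces to the standard deviation.
rmse_unbiased = rmse_from_mean_and_sd(0.0, 3.86)
```

Under these assumptions the bias alone raises the implied RMSE for 10mCont well above its standard deviation, which is why a single RMSE value conflates systematic and random error.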
An Idrisi raster data file (*.rst) is accompanied by a documentation file (*.rdc) that has comment, lineage, consistency and completeness fields in which such quality report information could be stored (Fig. 3.22a). For ArcView raster grids, this quality information could be entered in the comments section of a theme's properties (Fig. 3.22b).

Fig. 3.22: Potential storage of a DEM's quality report: a) The notes section of Idrisi's raster metadata file; b) the comments section of an ArcView theme's properties.



3.5.2 The Quality of the DEMs

Applying the quality assessment approaches to the 26 DEMs has allowed an investigation into the factors affecting DEM quality. This investigation focused on answering the questions listed in §3.3.10.

3.5.2.1 Data Distribution Issues

1. Which digital elevation data sampling pattern is best: contour lines, contour vertices or a regular grid?

It is difficult to make a comparative judgement of the performance of the different sampling patterns in isolation from the influence of the other issues of data density, interpolation method and software package. It is fair to compare the DEM produced from the 10m interval contour lines, the DEMs produced using all the vertices of those same contour lines and the DEMs produced from a regular 10m interval grid of points. However, there is a slight discrepancy between the 40,000 points of the grids with 10m spacing and the 32,895 points of the contour vertices.

Table 3.8 lists the quality scores, with rankings in brackets, for the five DEMs and notes their sampling pattern as lines, vertices or grid.

Table 3.8 Comparison of scores and ranks for different sampling patterns.

DEM        Accuracy   Geomorphometry   Overall    Sampling pattern
10mCont    131 (21)   17 (5)           148 (6)    Lines
10mGrid2   83 (8)     47 (11)          129 (3)    Grid
10mGrid5   57 (2)     92 (16)          149 (7)    Grid
AllVert2   58 (3)     129 (21)         187 (13)   Vertices
AllVert5   73 (6)     145 (24)         218 (19)   Vertices

It is evident that high quality DEMs have been produced from the regular grid and contour lines, but using all contour vertices is not as effective. The contour line DEM is of low accuracy, but high geomorphometric quality. However, interpolation artefacts can be identified in the visualisations of this DEM (Fig. 3.11 and Fig. 3.17), which are evidently not quantified by the geomorphometric indices. This gives a good example of why it is important to follow all three approaches to quality assessment.



The regular grid DEMs are of high accuracy and medium geomorphometric quality. The DEMs produced from all contour vertices are of high accuracy, but poor geomorphometric quality. Inverse distance weighting appears to produce DEMs of lower geomorphometric quality than linear contour interpolation. This characteristic of the interpolation is most apparent when source data have an uneven distribution as with all the contour vertices rather than the perfectly uniform distribution of the regular grid. It is not that one data distribution is better than another, but that particular distributions work better with certain interpolators than others. An inverse distance weighting interpolator that can divide the search radius into sectors may well make a good quality DEM from all the contour vertices.

2. How does decreasing the contour interval affect quality?

Two DEMs have been generated from contour line data. 10mCont was generated from contours with a 10m interval, while 50mCont was generated from contours with a 50m interval. The quality rankings and scores for these two DEMs are shown in Table 3.9.

Table 3.9 Comparison of scores and ranks for different contour intervals.

DEM       Accuracy   Geomorphometry   Overall
10mCont   131 (21)   17 (5)           148 (6)
50mCont   157 (23)   109 (19)         267 (25)

The 10m contour DEM outperforms the 50m contour DEM in terms of overall quality and geomorphometric quality, but the accuracy of the two DEMs is similar. The sparsity of contour data at 50m intervals evidently causes significant interpolation artefacts as revealed by the visualisations (Fig. 3.12 and 3.18) and geomorphometric score. The low accuracy of both DEMs is likely to be due to the over-simplicity of the contour line interpolation algorithm. As would be expected, increasing the contour interval does decrease quality, but interpolation from contour lines only seems preferable if a more sophisticated algorithm is not available.

3. How does reducing the number of contour vertices used influence quality?

To ascertain the influence of using Douglas-Peucker line generalisation to reduce the number of vertices by approximately 50%, the quality scores and rankings for the four DEMs generated with Idrisi's Interpol routine using the two sets of contour vertices are shown in Table 3.10.



Table 3.10 Comparison of scores and ranks for different numbers of contour vertices.

DEM        Accuracy   Geomorphometry   Overall
AllVert2   58 (3)     129 (21)         187 (13)
AllVert5   73 (6)     145 (24)         218 (19)
GenVert2   113 (16)   105 (18)         218 (18)
GenVert5   73 (6)     146 (25)         219 (20)

The results show that reducing the number of contour vertices decreases the accuracy of a DEM when interpolated with an inverse distance weight of 2. However, with a weight of 5 the generalised vertices are just as accurate as all vertices. The vertices of contour lines represent an uneven spatial distribution of data and thinning the vertices in an attempt to spread the data distribution more evenly does not have to mean a reduction in DEM accuracy.

The effect of generalisation on geomorphometric character is surprising. The purpose of generalisation is to give a less biased distribution of source data and hence a less terraced, more geomorphometrically sound DEM. However, the results show that geomorphometric quality is similar for generalised and ungeneralised vertices. The figures for the individual geomorphometric indices (Table 3.3) show that contour bias is indeed reduced when interpolating with an inverse distance weight of 2 applied to the generalised vertices, but this improvement does not occur with a weight of 5. It seems that the poor quality of Idrisi's inverse distance weighting interpolator, discussed in §3.5.2.3, does not allow the benefits of generalising contour vertices to become evident. Other DEMs produced from generalised vertices attain better quality scores.
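For readers unfamiliar with the generalisation step, the Douglas-Peucker algorithm referred to above can be sketched as follows. This is a textbook recursive form, not the implementation used to prepare the generalised vertices in this study; the tolerance and the sample contour are hypothetical, chosen so that half of the vertices are removed.

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dy * (px - ax) - dx * (py - ay)) / math.hypot(dx, dy)

def douglas_peucker(points, tolerance):
    """Drop vertices lying within `tolerance` of the simplified line."""
    if len(points) < 3:
        return list(points)
    distances = [point_line_distance(p, points[0], points[-1])
                 for p in points[1:-1]]
    idx = max(range(len(distances)), key=distances.__getitem__) + 1
    if distances[idx - 1] <= tolerance:
        return [points[0], points[-1]]
    # Keep the farthest vertex and simplify the two halves separately.
    left = douglas_peucker(points[:idx + 1], tolerance)
    right = douglas_peucker(points[idx:], tolerance)
    return left[:-1] + right

# Hypothetical contour vertices (x, y) in metres and a hypothetical tolerance.
contour = [(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7), (6, 8.1), (7, 9)]
simplified = douglas_peucker(contour, tolerance=1.0)
```

Vertices are retained where the line changes direction and removed along near-straight stretches, which is how generalisation redistributes the source data along each contour.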

4. How does increasing grid spacing influence quality?

Three regular grids of elevation data have been generated at 10m, 30m and 50m spacings. For each of these grids Idrisi has been used for inverse distance weighted interpolation with weights of 2 and 5. So six DEMs have been generated with which to investigate the influence of grid spacing on elevation error. The quality scores and rankings of these 6 DEMs are given in Table 3.11.



Table 3.11 Comparison of scores and ranks for different grid spacings.

DEM        Accuracy   Geomorphometry   Overall
10mGrid5   46 (7)     92 (16)          149 (7)
10mGrid2   47 (8)     47 (11)          129 (3)
30mGrid5   79 (21)    156 (26)         265 (24)
30mGrid2   112 (26)   23 (5)           248 (23)
50mGrid5   100 (24)   78 (14)          269 (26)
50mGrid2   98 (23)    29 (7)           234 (22)

It is evident that increasing the grid spacing from 10m to 30m causes a sharp decrease in accuracy and overall quality. In the mountainous terrain of this study area there is a significant loss of landform detail at the 30m grid spacing level. Further increasing the grid spacing to 50m does not cause such a loss of detail. In fact one 30m grid DEM is less accurate than the two 50m grid DEMs, a puzzling and perhaps anomalous result. The results demonstrate that higher sampling densities give DEMs with lower elevation errors. There is also evidence of a cut-off point at which a lower sampling density gives rise to a sudden loss of elevation accuracy. This point will be related to the periodicity and scale of the landforms comprising the terrain of the study area. It is evident that intervals of 10m are required to give a good representation of landforms at this scale of study. For studies over larger areas and at a smaller scale, wider grid spacings will be more appropriate.

There is not such a clear relationship between geomorphometric quality and grid spacing. This indicates that regular grids produce DEMs with similar surface characteristics regardless of grid spacing. However, taking into consideration both accuracy and geomorphometric character, the overall quality of the 10m grid DEMs is much higher.

3.5.2.2 Interpolation Issues

1. Which interpolation method is best: linear, inverse distance weighting or spline?

This comparison of the performance of different interpolation techniques uses the DEMs generated in ArcView and the 10m contour DEM produced in Idrisi. Table 3.12 gives the scores and rankings for these DEMs.



Table 3.12 Comparison of scores and ranks for different interpolation methods.

DEM       Accuracy   Geomorphometry   Overall
10mCont   131 (21)   17 (5)           148 (6)
IDW16     125 (18)   35 (9)           160 (10)
IDW26     113 (15)   59 (12)          171 (11)
IDW36     106 (13)   86 (15)          192 (14)
IDW46     103 (12)   115 (20)         218 (17)
IDW56     67 (4)     144 (23)         211 (16)
IDW112    47 (1)     14 (2)           61 (1)
IDW212    73 (5)     40 (10)          112 (2)
IDW312    102 (11)   70 (13)          172 (12)
IDW412    100 (10)   102 (17)         202 (15)
IDW512    99 (9)     134 (22)         233 (21)
SpReg6    126 (19)   32 (8)           159 (9)
SpReg12   128 (20)   26 (6)           154 (8)
SpTen6    119 (17)   16 (3)           135 (4)
SpTen12   131 (22)   8 (1)            139 (5)

Spline interpolation consistently produces DEMs of high geomorphometric and overall quality, but accuracy tends to be low. The 10m contour DEM shows similar characteristics. Some of the inverse distance weighting DEMs have good overall quality scores, but performance is highly variable. The interpolation artefacts revealed by visualisation must also be taken into account, particularly for the 10m contour DEM and DEMs produced with a high inverse distance weight, which causes puddling. Although DEMs of different character have been produced, it is evident that, in terms of accuracy and geomorphometric character, no interpolation method can be considered consistently better than the others.

2. How does increasing the search radius from 6 to 12 points influence quality?

Spline and inverse distance weighting in ArcView provide seven pairs of DEMs for examining the influence of the search radius. The scores and rankings for these DEMs have already been given in Table 3.12.

For the spline DEMs the larger search radius consistently improves geomorphometric quality, but reduces accuracy. For inverse distance weighting, the larger search radius improves geomorphometric quality and can improve accuracy if used in conjunction with a low weighting of 1 or 2. This should be expected because the additional 6 points used with a 12 point search radius will be further from the cell being interpolated. Therefore, they will have little influence on the interpolated value when using a high weight.

3. How does the weight used in inverse distance weighting influence quality?

The influence of inverse distance weight can be assessed by comparing the scores and rankings of the ten inverse distance weighting DEMs in Table 3.12. With a search radius of 6, increasing the weight increases accuracy, but the reverse is true with a search radius of 12. Idrisi and ArcView manuals recommend 2 as a commonly used weight (Clark Labs, 2001; ESRI, 1998). Therefore, a lower weight would be expected to be more accurate, but this is not necessarily the case. The weight and search radius work in combination to influence accuracy.

With both search radii, increasing the weight decreases the geomorphometric quality. This is because using lower weights makes inverse distance weighting behave more like a linear interpolator, rather than creating the clearly defined puddles seen in Fig. 3.13 which lead to increased contour bias and flatness.

The reduction in geomorphometric quality with greater weights is more marked than any change to accuracy. So overall quality decreases for greater weights.
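The interaction between weight and search radius can be illustrated with a generic inverse distance weighting estimator. This is a minimal sketch and does not reproduce the ArcView or Idrisi algorithms, whose details are not published; the function name, the `power` and `n_points` parameters and the sample points are all hypothetical, with the search radius expressed as a number of nearest points as in the DEM names above.

```python
import numpy as np

def idw(xy, z, target, power=2.0, n_points=6):
    """Inverse distance weighted estimate at `target`
    from the `n_points` nearest sample points."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    d = np.hypot(*(xy - np.asarray(target, dtype=float)).T)
    nearest = np.argsort(d)[:n_points]
    if d[nearest[0]] == 0.0:
        return float(z[nearest[0]])        # target coincides with a sample
    w = 1.0 / d[nearest] ** power          # weights fall off with distance
    return float(np.sum(w * z[nearest]) / np.sum(w))

# Hypothetical sample points (x, y in metres) and elevations (metres).
xy = [(0, 0), (10, 0), (0, 10), (10, 10)]
z = [100.0, 110.0, 120.0, 130.0]

low_weight = idw(xy, z, (2, 2), power=1)    # distant points still contribute
high_weight = idw(xy, z, (2, 2), power=5)   # dominated by the nearest point
```

With the high weight the estimate is pulled almost entirely to the nearest sample value, which is consistent with the flat puddles around data points described above and with the limited influence of more distant points when the weight is high.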

4. What difference is there between regularised splines and splines with tension in terms of quality?

Regularised splines produce a more smoothly rounded surface than splines with tension. The effect of this is that splines with tension produce slightly more accurate DEMs and slightly lower pit volumes, but increase the other geomorphometric indices. The effect on overall quality is that neither regularised splines nor splines with tension are consistently better. It should be noted that the scale of difference is small compared to the differences caused by the other variations to the DEM generation process described previously.

3.5.2.3 Software Issues

1. Does the quality of inverse distance weighting DEMs vary between the ArcView and Idrisi packages?

Scores and rankings for the DEMs involved in this comparison are given in Table 3.13.



Table 3.13 Comparison of scores and ranks for Idrisi and ArcView.

DEM        Accuracy   Geomorphometry   Overall
IDW26      113 (15)   59 (12)          171 (11)
IDW56      67 (4)     144 (23)         211 (16)
GenVert2   113 (16)   105 (18)         218 (18)
GenVert5   73 (6)     146 (25)         219 (20)

The accuracy of the comparable ArcView and Idrisi DEMs is very similar, but ArcView creates DEMs with higher geomorphometric quality and hence higher overall quality. The same interpolation method and same interpolation parameters have been used on the same data set, so the variation can only be ascribed to a difference in the inverse distance weighting algorithms used. Commercial GIS software companies tend to give little detail of the algorithms they use (Carrara et al., 1997) and this case is no exception. This illustrates the importance of finding out as much about software as is possible and testing the functionality.

3.6 Conclusions

Visual assessment, geomorphometric characterisation and accuracy assessment are three approaches to assessing DEM quality. Applying these approaches to the 26 DEMs has revealed that pattern and density of source data, interpolation method, search radius and other interpolation parameters interact to determine DEM quality. Each of these factors can cause significant degradation of DEM quality. There is evidence of a trade-off between accuracy and surface form. Accurate DEMs tend to be pitted, puddled or terraced. The main findings are summarised below:

- increasing the density of source data significantly increases accuracy and can increase geomorphometric quality;
- DEMs produced using the same data and methods but with different software will not necessarily be the same;
- increasing the inverse distance weight increases accuracy, but decreases smoothness;
- inverse distance weighting produces more angular surfaces with puddles and terraces;
- spline interpolation produces smooth, rounded surfaces with spurious pits;
- using splines with tension decreases smoothness, creates smaller pit volumes and increases accuracy;
- linear contour interpolation produces highly angular surfaces and is prone to artefacts such as ramps and channels;
- increasing the search radius increases smoothness, but decreases accuracy.

The type of information about a DEM's quality that each approach reveals demonstrates that all three approaches are useful and complementary. To give a comprehensive report on a DEM's quality all three approaches should be pursued and the findings summarised. First, a DEM quality report should include a written statement describing the representation of landforms, the general nature of the terrain surface and the presence of artefacts from the DEM generation process as revealed by orthographic views, gradient and aspect images. Second, a DEM quality report should give figures for the degree of contour bias, if the DEM was created from contour-derived data, flatness index and pit volumes. Third, the DEM quality report should describe the accuracy of the DEM by giving the mean, standard deviation and reliability of error measurements.

Applying these quality assessment approaches to the 26 DEMs has revealed that the quality of a DEM is the product of a number of factors, including pattern and density of sample data, interpolation method and parameters used and software. The quality report should also include a full description of these factors, i.e. the lineage of the DEM production process. This will help users identify potential quality limitations with the DEM and judge how appropriate the DEM is for a particular purpose. An example of such a quality report is given in Appendix 8.
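The contents of such a report can also be expressed as a simple record structure. The sketch below is illustrative only and is not the report format of Appendix 8: the class and field names are hypothetical, the accuracy figures are those given for 10mCont in Table 3.4, and the geomorphometric indices are left unset because their raw values (Table 3.3) are not repeated in this section.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DemQualityReport:
    dem_name: str
    lineage: str                  # source data, interpolation method, parameters, software
    visual_assessment: str        # landforms, surface character, artefacts
    mean_error_m: float           # systematic bias
    sd_error_m: float             # dispersion of errors
    reliability: float            # reliability measure of the error distribution
    flatness_index: Optional[float] = None
    pit_volume: Optional[float] = None
    contour_bias: Optional[float] = None   # only for contour-derived DEMs

    def summary(self) -> str:
        """One-line accuracy summary suitable for a metadata comment field."""
        return (f"{self.dem_name}: mean error {self.mean_error_m:.2f} m, "
                f"SD {self.sd_error_m:.2f} m, reliability {self.reliability:g}")

report = DemQualityReport(
    dem_name="10mCont",
    lineage="10m interval contour lines; linear contour interpolation; Idrisi",
    visual_assessment="Interpolation artefacts visible in gradient and aspect images.",
    mean_error_m=-2.77,
    sd_error_m=3.86,
    reliability=4,
)
```

The text returned by `summary()` is short enough to be placed in a metadata comment or lineage field alongside the written statement.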



Chapter 4: Modelling the Spatial Distribution of DEM Error

Measures of DEM accuracy give an indication of one aspect of DEM quality. Accuracy measures can also be used to model the effect that DEM error will have on the results of DEM-based analyses. The accuracy measures used in the previous chapter summarise elevation errors in a DEM as a single value. This chapter is concerned with providing a more detailed description of accuracy by representing the spatial variation in error across a DEM.

4.1 Rationale, Aim and Objectives

A more detailed description of DEM accuracy provides better understanding of DEM quality and the consequent uncertainty associated with using DEMs in environmental modelling applications. Spatially variable error surfaces would provide this more detailed description and also give a better representation of DEM errors for use in uncertainty modelling using techniques such as Monte Carlo simulation. Anecdotal and empirical evidence shows that DEM error is spatially variable, spatially correlated and heteroscedastic, being related to the form of the terrain. However, very little research has attempted to model this heteroscedasticity.

The aim of the research presented here is to test the hypothesis that DEM error is related to terrain characteristics and assess whether DEM error surfaces can be created by modelling this relationship.

The objectives are:

- To examine the relationship between DEM elevation error and terrain characteristics;
- To develop a model of the relationship between DEM error and terrain characteristics to produce spatially variable, spatially correlated, heteroscedastic error surfaces;
- To assess the quality of the error surfaces.


4.2 The Spatial Distribution of DEM Error

Describing elevation errors in a DEM with a single, global accuracy measure, such as standard deviation or RMSE, has advantages. The single value is relatively quick to calculate and easy to report. A single value makes comparison of DEMs a simple task. Global accuracy measures have also been used to model the influence of DEM error on uncertainty in DEM-based spatial modelling outcomes. However, a number of authors recognise that a single global accuracy measure has its limitations. Wood (1994) states that any useful study of DEM accuracy must investigate the spatial variation of error values. Kyriakidis et al. (1999) describe how a global accuracy statistic does not allow identification of areas where error is greatest and additional source data would most benefit DEM quality. Theobald (1989), Zhang and Montgomery (1994) and Weibel and Brändli (1995) all recognise that appreciating the spatial variability of accuracy is critical to environmental applications. For example, small errors in relatively flat areas will have a greater impact on surface run-off and flood modelling than in steeper areas (Burrough & McDonnell, 1998). Alternatively, in viewshed analysis, errors in higher, steeper terrain will have the greatest impact on results (Fisher, 1991).

Burrough and McDonnell (1998) state that a single RMSE value implies that error is uniform across the DEM. Several authors identify this assumption of stationarity as invalid (Kyriakidis et al., 1999). In their research, Ehlschlaeger and Shortridge (1997) and Hunter and Goodchild (1997) use a spatially uniform model of DEM error, but acknowledge that it is actually spatially variable. §4.2.1 reviews work by a number of authors who report that the magnitude of elevation error is related to characteristics of the terrain. Terrain characteristics are evidently spatially variable and therefore DEM elevation errors will also be spatially variable.

4.2.1 The Relationship between DEM Errors and Terrain

It seems intuitive that certain types of terrain will be more suited to creation of accurate DEMs. Several authors make unverified statements about a potential relationship between DEM errors and terrain characteristics. Gao (1997) observes that DEM errors seem lower in less complex terrain. Hunter and Goodchild (1997) state that DEM error is probably related to slope steepness. Carrara et al. (1997) suggest that DEMs derived from stereo aerial photography could have greater errors on steep and shaded slopes. McDermid and Franklin (1995) state that photogrammetrically produced DEMs will be most accurate on open flat terrain and least accurate on steep, shadowed and vegetated terrain.

There are some examples of researchers identifying and quantifying this relationship between error and terrain. Bolstad & Stowe (1994) evaluate the accuracy of elevation values for two DEMs. They find that the largest elevation errors tended to occur in the highest and lowest parts of the study area. Ehlschlaeger and Shortridge (1997) report that empirical studies have shown DEM error to be related to gradient and propose that it may also be related to other elevation derivatives. Kyriakidis et al. (1999) find that DEM error is correlated with terrain ruggedness. Guth (1992) finds DEM error to be highly correlated with gradient, aspect and satellite image reflectance value. Guth (1995) gives explanations for these relationships. In steep terrain a small adjustment to a stereo model causes large differences in recorded elevation. Aspect determines which areas are shaded at the time of aerial photograph capture. Reflectance values in satellite imagery are related to vegetation cover, which may obscure the actual ground surface. Wood (1993, 1996) shows how accuracy of a DEM can be related to gradient and aspect. He produces a regression model to predict accuracy from known gradient and aspect measurements.

4.2.2 Modelling the Distribution of DEM Errors

It is widely acknowledged that DEM error is spatially variable and related to terrain characteristics. However, there has been little research attempting to model this error distribution. A number of the authors mentioned above acknowledge this spatial variation, then proceed with their research assuming a uniform DEM error distribution. Ehlschlaeger and Shortridge (1997) defend this assumption by stating “modelling elevation data uncertainty is a difficult task”. The work that has been done either models the spatial correlation of DEM errors or attempts to create an error surface as a model of the distribution of errors.

4.2.2.1 Spatial Correlation

Spatial correlation describes the tendency for the value of a variable at one location to be most similar to the values at neighbouring locations and of decreasing similarity as distance increases. Simulated error surfaces used in Monte Carlo simulation (see Chapter 5) are generated from global accuracy measures. The simulated error surfaces consist of a random mixture of error values. Hunter and Goodchild (1997) assert that DEM error is spatially correlated and therefore a model of DEM error should not be random, but spatially dependent. They present a spatially autoregressive error model which switches cell values until a spatially correlated error surface is produced, i.e. a surface in which the error values change gradually from one cell to the next. However, their work is based on the single global RMSE value and the spatially correlated error values only vary within a normal distribution with this global RMSE value.

There are two commonly used indices for quantifying the degree of spatial correlation within a dataset: Geary's c and Moran's I. Monckton (1994) reports on his use of Moran's I to simulate the spatial structure of elevation error. This index measures the similarity of values at specific distances (lags). A model of the change in spatial correlation with distance can be built up by calculating the index for a number of distances. Monckton finds no evidence of spatial autocorrelation in the error distribution of his study. However, the results can be considered inconclusive as the sparsity of sample points used, spot heights on Ordnance Survey maps, only allowed examination over lags of 250m or greater. There may be spatial autocorrelation of elevation error at shorter lags than those he used.
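To make the idea of calculating an index at a number of lags concrete, the following is a minimal sketch of Moran's I for point samples of elevation error. It assumes binary weights (a pair of points contributes only if its separation falls inside the lag band); the function name, the weighting scheme and the lag bands are illustrative assumptions and are not taken from Monckton (1994).

```python
import math

def morans_i(points, values, lag_min, lag_max):
    """Moran's I with binary weights: w_ij = 1 when lag_min <= d_ij < lag_max.

    points : list of (x, y) sample locations
    values : list of error values at those locations
    Returns None when no pair of points falls inside the lag band
    or when the values have no variance.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    denominator = sum(d * d for d in dev)
    numerator = 0.0
    weight_sum = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            distance = math.dist(points[i], points[j])
            if lag_min <= distance < lag_max:
                numerator += dev[i] * dev[j]
                weight_sum += 1
    if weight_sum == 0 or denominator == 0:
        return None
    return (n / weight_sum) * (numerator / denominator)

def correlogram(points, values, lag_edges):
    """Moran's I for each consecutive pair of lag edges, e.g. [0, 250, 500]."""
    return [morans_i(points, values, lo, hi)
            for lo, hi in zip(lag_edges[:-1], lag_edges[1:])]
```

Values near +1 indicate that errors at that separation are similar, values near -1 that they alternate, and values near zero that there is no spatial structure at that lag. A band with no point pairs returns None, which mirrors the problem of sparse sample points described above.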

Research by Giles & Franklin (1996) used semi-variance analysis to evaluate the periodicity of error in the form of random noise. This allowed an optimum sized filter to be determined so as to remove the random noise, leaving only the spatially correlated portion of the error. However, the research did not proceed to investigating the nature of the remaining error, either in terms of the level of spatial autocorrelation or the magnitude of error associated with particular terrain characteristics.

These three examples of modelling the spatial correlation of error produce error surfaces that are spatially variable and spatially correlated, but homoscedastic. This means error values vary, but not in relation to any other variable. The apparent relationship between error and terrain means that a DEM error surface should be heteroscedastic.

4.2.2.2 Spatially Distributed DEM Error Models

Burrough and McDonnell (1998) favour the use of Kriging interpolation techniques when creating a surface from point data, because a second surface is generated which represents
the predicted accuracy of the interpolated values as spatially distributed standard deviation values. A linear spline fitting interpolation technique described by Wood (1994) similarly produces an RMSE surface quantifying the accuracy of the interpolated values. These surfaces are distributed error models, but they only describe uncertainty in the interpolation estimates and therefore show RMSE or standard deviation increasing with increasing distance from the original data points. They do not take into account the accuracy of the source data or the relationship of error with terrain character.

Kyriakidis et al. (1999) find a strong correlation between error and terrain ruggedness (ρ = 0.64), which they quantify using the standard deviation of elevation values within a 3 × 3 window of each cell. They then proceed to create higher accuracy DEMs by applying co-kriging to the original DEM, basing new elevation values on distance from high accuracy height measurements and the standard deviation of elevation values. This co-kriging is performed in a multi-Gaussian framework, which means that a user-specified number of realisations of this higher accuracy DEM are created. The standard deviation in elevation values at each cell location across the set of multiple DEM realisations gives an estimate of the accuracy of the higher accuracy DEMs in the form of an error surface.

Potential limitations to this approach can be identified. First, the error they calculate and use is the difference between a USGS DEM with a resolution of one degree and another USGS DEM with a resolution of 7.5 minutes. They are in effect analysing the difference between two models of the terrain surface rather than modelling the accuracy of a DEM in relation to the actual on-the-ground elevation. Second, the method is based on the strong correlation between error and what they term terrain ruggedness. This correlation may be a property of the relationship between the two DEMs rather than a terrain-error relationship. This would have to be verified by repeating the technique using different DEMs and a non-DEM source of higher accuracy elevation measurements. Third, they quantify terrain ruggedness as the standard deviation of elevation, which means that it is actually a measure of relative relief (Evans, 1972). Their method does not take any account of the reported relationship between error and other terrain characteristics, such as aspect and gradient, although relative relief is related to gradient. Nonetheless, their research is the only example of the creation of a heteroscedastic, spatially correlated error surface and the approach used is worthy of further investigation.


4.3 Methodology

4.3.1 Study Areas

The primary study area for this research is the area of Snowdonia used for the research presented in Chapter 3 (Fig. 3.6). Key stages of the research have been reapplied to a 23.5km x 18.1km region of Mestersvig, northeast Greenland (Fig. 4.1; Appendix 3). The Greenland study area is used to validate that the methodology can be usefully applied to another mountain region using a different scale of source data and different resolution DEMs, and that the success of the Snowdon work was not coincidental. The Mestersvig area is largely snow and ice free during summer, although there are a few small permanent patches of névé on higher ground and steep, north-facing slopes. There is no woodland cover. The aerial photography used to derive the digital elevation data (§4.3.2.1) has not been inspected. Shadowing may obscure some of the terrain, but snow, cloud and vegetation cover are unlikely to have a significant effect on data quality.


Fig. 4.1 Location of the Mestersvig study area.


4.3.2 Data

4.3.2.1 DEMs

Three of the 26 DEMs created in the research presented in Chapter 3 have been used. These are:
The DEM with the highest overall quality score: IDW112, created using ArcView's inverse distance weighting interpolation of the generalised contour vertices with a weight of 1 and a 12 point search radius;
The DEM created using spline interpolation with the best overall quality score: SpTen12, created using spline with tension interpolation of the generalised contour vertices with a 12 point search radius;
The DEM created using ArcView's inverse distance weighted interpolation with the lowest overall quality score: IDW512, created using ArcView's inverse distance weighting interpolation of the generalised contour vertices with a weight of 5 and a 12 point search radius.

These DEMs were chosen because they represent a range in quality, but only contain a relatively even distribution of artefacts, such as puddling throughout the study area, rather than more randomly distributed artefacts such as the ramps in DEMs created by linear interpolation of contour lines.

For the Mestersvig site 1:15,000 scale Mylar contour maps derived from aerial photography were the only available source of elevation data. These maps contained contour lines at a 10m vertical interval. In the low-lying coastal zone all contours from 0m to 100m above sea level and the 120m contour were manually digitised. In the steeper more mountainous areas further inland every fifth contour was manually digitised. This digitising strategy gave the best possible definition of the low lying areas, while avoiding excessive effort digitising the uplands (Fig. 4.2). Spot heights, mainly located on summits, were also digitised. The digitised contours were generalised using the Douglas-Peucker line generalisation algorithm to reduce the number of vertices by about 50%. The contour vertices were converted to a point data set and merged with the spot height data. Three DEMs with a resolution of 10m were generated using the techniques used for the three Snowdonia DEMs.
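The line generalisation step can be illustrated with a minimal sketch of the Douglas-Peucker algorithm. This is a generic textbook form, not the implementation used to prepare the contour data; the tolerance value that gave the reduction of about 50% is not stated here, so `tolerance` is left as a free, hypothetical parameter in map units.

```python
import math

def _distance_to_chord(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        # Degenerate chord (closed contour): fall back to point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length

def douglas_peucker(line, tolerance):
    """Return a generalised copy of `line`, a list of (x, y) vertices.

    A vertex is kept only if it lies further than `tolerance` from the
    chord joining the retained vertices on either side of it.
    """
    if len(line) < 3:
        return list(line)
    index, d_max = 0, 0.0
    for i in range(1, len(line) - 1):
        d = _distance_to_chord(line[i], line[0], line[-1])
        if d > d_max:
            index, d_max = i, d
    if d_max <= tolerance:
        return [line[0], line[-1]]
    left = douglas_peucker(line[: index + 1], tolerance)
    right = douglas_peucker(line[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split vertex
```

A larger tolerance removes more vertices, so in practice the tolerance would be adjusted until the desired proportion of vertices remains.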


Fig. 4.2 Digitised contours for the Mestersvig site.

4.3.2.2 Measurements of DEM Error

For Snowdonia, the DEM error measurements described in Chapter 3 were used. For the Mestersvig site, the same methodology as presented in Chapter 3 (§3.3.8.1) was used to collect GPS measurements of elevation at 103 sample points and calculate DEM error.

4.3.4 Deriving Terrain Parameters

In order to examine and model the relationship between DEM error and terrain form, a set of terrain parameters giving a comprehensive description of terrain form was derived from the DEMs. Evans' (1972) five geomorphometric parameters that represent a comprehensive and non-duplicative quantification of surface form (elevation, gradient, aspect, plan curvature and profile curvature; see §2.1.4.1) were derived for each DEM. In addition, six other terrain parameters were derived which quantify other characteristics of terrain: surface heterogeneity (overall curvature, relative relief and texture) and terrain position (mean, minimum and maximum extremity). The distance from a cell to the nearest contour vertex was also calculated, as it would be expected that error is higher for locations that are a greater distance from the source data. Table 4.1 describes the twelve parameters.


Table 4.1 Terrain parameters.

| Parameter | Description |
| --- | --- |
| Elevation | |
| Gradient | The first derivative of elevation, also known as slope angle, representing the maximum rate of change of elevation. |
| Aspect | The compass direction of the maximum rate of change in elevation. |
| Plan Curvature | The horizontal component of the second derivative of elevation representing the rate of change of aspect. |
| Profile Curvature | The vertical component of the second derivative of elevation representing the rate of change of gradient. |
| Overall Curvature | The second derivative of elevation representing the surface's degree of convexity or concavity. |
| Relative Relief | The range of elevation values of all grid cells within a 10-cell radius of the grid cell concerned. |
| Texture | The range of gradient values of all grid cells within a 10-cell radius of the grid cell concerned. |
| Mean Extremity | The elevation of a grid cell minus the mean elevation of all grid cells within a 10-cell radius of that grid cell. Indicates the vertical position of the grid cell relative to its neighbours. |
| Minimum Extremity | The elevation of a grid cell minus the lowest elevation of all grid cells within a 10-cell radius of that grid cell. A value of near zero would indicate that that grid cell is in a pit. |
| Maximum Extremity | The elevation of a grid cell minus the highest elevation of all grid cells within a 10-cell radius of that grid cell. A value of near zero would indicate that that grid cell is on a peak. |
| Vertex Distance | The distance between a grid cell and the nearest of the contour vertices from which the DEM was interpolated. |

The DEM Uncertainty and Quality extension to ArcView includes a script (“DEMUncQual_TPs.ave”; Appendix 2) that automates the process of deriving terrain parameters. The script takes a DEM and point layer as input. The point layer represents the locations of the GPS sample points and must contain a column in its attribute table with the GPS elevation measurements. At each sample point, the terrain parameters are derived, extracted and stored in a new attribute table along with calculations of the error.
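The neighbourhood-based parameters in Table 4.1 can be sketched in plain Python as follows. This is an illustration of the definitions only, not the Avenue script: it assumes that the centre cell is included in its own neighbourhood, that the window is simply truncated at the grid edge, and that the DEM is held as a list of rows. The function names are hypothetical.

```python
def circular_neighbours(grid, row, col, radius):
    """Values of all cells whose centres lie within `radius` cells of (row, col)."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for r in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            if (r - row) ** 2 + (c - col) ** 2 <= radius ** 2:
                values.append(grid[r][c])
    return values

def position_parameters(dem, row, col, radius=10):
    """Relative relief and the three extremity parameters for one cell."""
    neighbours = circular_neighbours(dem, row, col, radius)
    z = dem[row][col]
    return {
        "relative_relief": max(neighbours) - min(neighbours),
        "mean_extremity": z - sum(neighbours) / len(neighbours),
        "min_extremity": z - min(neighbours),   # near zero in a pit
        "max_extremity": z - max(neighbours),   # near zero on a peak
    }
```

Texture follows the same pattern as relative relief, with a gradient grid passed in place of the DEM.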


The script uses the following ArcView functionality to derive the terrain parameters. The first derivatives of elevation are calculated using ArcView's aGrid.Slope and aGrid.Aspect functions. These functions implement Horn's (1982) method of fitting a plane to a 3 × 3 window of cells and deriving gradient (ESRI, 1998). The second derivatives of elevation are calculated using ArcView's aGrid.Curvature function, which implements Zevenbergen and Thorne's (1987) method of fitting a fourth order polynomial surface to a 3 × 3 window of cells and deriving overall curvature, plan curvature and profile curvature (ESRI, 1998). Other terrain parameters are calculated using ArcView's aGrid.FocalStats and grid overlay functionality.
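As an illustration of how first derivatives can be taken from a 3 × 3 window, the sketch below uses the weighted finite differences with which Horn's method is commonly described. It is an assumption that this matches the ArcView functions in every detail; the function name is hypothetical, rows are assumed to run from north (first row) to south, and aspect is returned as a compass bearing of the downslope direction.

```python
import math

def horn_gradient_aspect(window, cell_size):
    """Gradient (degrees) and aspect (degrees clockwise from north).

    window    : 3 x 3 list of elevations, first row being the northern row
    cell_size : grid resolution in the same units as elevation
    Aspect is None for a flat window, where it is undefined.
    """
    (a, b, c), (d, _centre, f), (g, h, i) = window
    # Weighted differences: the middle row and column count double.
    dz_dx = ((c + 2 * f + i) - (a + 2 * d + g)) / (8 * cell_size)  # eastward
    dz_dy = ((a + 2 * b + c) - (g + 2 * h + i)) / (8 * cell_size)  # northward
    gradient = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    if dz_dx == 0 and dz_dy == 0:
        return gradient, None
    # Downslope direction is opposite to the elevation gradient vector.
    aspect = math.degrees(math.atan2(-dz_dx, -dz_dy)) % 360.0
    return gradient, aspect
```

Note that the centre cell does not enter the calculation, which is one reason why derivative grids can vary sharply from cell to cell.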

Aspect values require special consideration because they are from a circular scale of measurement in which both 0° and 360° represent the same northerly slope orientation. This circular scale could cause problems when trying to identify a relationship between DEM error and aspect and is certainly inappropriate to the manipulations described below in §4.3.6. So the “DEMUncQual_TPs.ave” script breaks aspect down into east-west and north-south vectors, termed here as aspect vector X and aspect vector Y respectively (Equation 4.1).

Equation 4.1: Aspect Vectors

$$a_x = \sin(\alpha \times 0.01745)\,\sin(\beta \times 0.01745)$$

$$a_y = \cos(\alpha \times 0.01745)\,\sin(\beta \times 0.01745)$$

where $a_x$ is aspect vector X, $a_y$ is aspect vector Y, $\alpha$ is aspect in degrees and $\beta$ is gradient in degrees. The constant 0.01745 is required for converting from degrees to radians as ArcView's trigonometric functions only work with radians. The principle of aspect vectors can be visualised by considering a rod (length = 1 map unit) skewered into the ground at an angle normal to the terrain surface (Fig. 4.3).
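A minimal sketch of the aspect vector calculation is given below. It follows the form of Equation 4.1 but uses Python's `math.radians` for the degree-to-radian conversion rather than the rounded constant 0.01745, so results will differ very slightly from those of the Avenue script; the function name is hypothetical.

```python
import math

def aspect_vectors(aspect_deg, gradient_deg):
    """Return (a_x, a_y): east-west and north-south aspect vectors.

    aspect_deg   : aspect in degrees clockwise from north
    gradient_deg : gradient in degrees from the horizontal
    """
    alpha = math.radians(aspect_deg)
    beta = math.radians(gradient_deg)
    a_x = math.sin(alpha) * math.sin(beta)
    a_y = math.cos(alpha) * math.sin(beta)
    return a_x, a_y
```

Because both components are scaled by the sine of the gradient, the vectors shrink towards zero on gentle slopes, where aspect is poorly defined, and aspects of 0° and 360° give the same pair of values.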


[Figure: diagram with compass axes North (+ive Y), South (-ive Y), East (+ive X) and West (-ive X), showing aspect (α), gradient (β), vector X and vector Y.]

Fig. 4.3 The aspect vector concept. The thick black line is 1 map unit in length and normal to the terrain surface.

4.3.5 Initial Investigations of the Relationship

Correlation coefficients were calculated to provide a first indication of any relationship between DEM error and terrain parameters. The attribute table produced by the “DEMUncQual_TPs.ave” script was imported to the SPSS statistical package, where Pearson's product moment correlation coefficients were calculated for each of the terrain parameters and elevation error. This was repeated for the three Snowdon DEMs and the three Mestersvig DEMs.

4.3.6 Deriving Additional Terrain Parameters

Initial investigations prompted modification of the twelve terrain parameters to create additional parameters that may better characterise the relationship between DEM error and terrain form. These additional terrain parameters fall into four groups, which are described below. The modifications increased the number of terrain parameters from 12 to 225.


4.3.6.1 Percentage Gradient

The gradient of a slope can be represented by the angle between the surface and the horizontal expressed in degrees or as rise over run expressed as a percentage. The relationship between these two methods of measurement is non-linear (Fig. 4.4). Therefore, measuring gradient as percent rise over run may show a stronger relationship with DEM error than gradient in degrees. “DEMUncQual_TPs.ave” was modified to calculate gradient and texture both as a percentage and in degrees.

[Figure: gradient in degrees (y axis, 0 to 90) plotted against gradient in percent (x axis, 0 to 600).]

Fig. 4.4 The non-linear relationship between gradient measured in percent and gradient measured in degrees.
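The non-linear relationship follows from the definition of percentage gradient as rise over run, i.e. 100 times the tangent of the slope angle. The short sketch below shows the conversion in both directions; the function names are hypothetical.

```python
import math

def degrees_to_percent(gradient_deg):
    """Gradient as percent rise over run, from gradient in degrees."""
    return 100.0 * math.tan(math.radians(gradient_deg))

def percent_to_degrees(gradient_pct):
    """Gradient in degrees, from percent rise over run."""
    return math.degrees(math.atan(gradient_pct / 100.0))
```

A 45° slope corresponds to 100%, and the percentage value grows without limit as the angle approaches 90°, so the percentage scale stretches differences between steep slopes relative to the degree scale.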

4.3.6.2 Mean Filtering

The grids representing first and second order derivatives of elevation were found to be highly variable from grid cell to grid cell. Producing grids showing the average values for a grid cell and the neighbouring grid cells within a certain radius by applying a mean filter would represent the underlying trends of the terrain parameters. Terrain characteristics are scale dependent (Wood, 1996). Mean filtering with a variety of filter window sizes would represent these terrain characteristics at a variety of spatial scales. Terrain parameters may have a stronger relationship with error at certain scales or the relationship may involve a combination of scales. To this end “DEMUncQual_TPs.ave” was modified to mean filter elevation, gradient, aspect vectors, plan curvature, profile curvature and overall curvature values using circular windows of 5, 10 and 20 cell radii.


Relative relief, texture, and the three extremity parameters were initially produced from filtering functions, using a circular window of 10 cell radius. In accordance with the above consideration of scale dependency, “DEMUncQual_TPs.ave” was modified to also calculate these parameters using circular windows of 5 and 20 cell radii.

4.3.6.3 Standard Deviation Filtering

Evans (1972) reports that, in addition to the mean value of a parameter for an area of terrain, or in this instance a neighbourhood of grid cells, the standard deviation of values provides important information about terrain form. Therefore, “DEMUncQual_TPs.ave” was extended to also calculate the standard deviation of elevation, gradient, aspect vectors, plan curvature, profile curvature and overall curvature values within circular windows of 5, 10 and 20 cell radii.

4.3.6.4 Polynomials

Correlation coefficients can only identify relationships where an increase in one variable is always associated with either an increase or a decrease of the other variable. It was suspected that non-linear relationships between terrain parameters and DEM error might exist. To investigate this possibility “DEMUncQual_TPs.ave” was further modified to calculate the squared and cubed values of each parameter at all window sizes.
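The motivation for the squared and cubed terms can be shown with a small, entirely made-up example. The data below are hypothetical and chosen so that the error depends on the square of a parameter; they are not measurements from either study area.

```python
import math

def pearson_r(x, y):
    """Pearson's product moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical parameter values (e.g. a curvature) and errors that grow
# with the magnitude of the parameter regardless of its sign.
parameter = [-2.0, -1.0, 0.0, 1.0, 2.0]
error = [4.0, 1.0, 0.0, 1.0, 4.0]

r_linear = pearson_r(parameter, error)
r_squared_term = pearson_r([p ** 2 for p in parameter], error)
```

Here the raw parameter shows no linear correlation with error, whereas its square is perfectly correlated, which is the kind of relationship the additional polynomial variables are intended to capture.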

4.3.7 Modelling the Error-Terrain Relationship

Correlation coefficients were calculated for all terrain parameters to investigate the relationship of each to DEM error. It was unlikely that any one terrain parameter would show a strong relationship with DEM error. It was expected that a number of terrain parameters acting in combination would influence the spatial variation in DEM error. The multiple linear regression functionality of SPSS was used to identify this multivariate relationship. Although linear regression models were employed, the use of the squared and cubed terrain parameters means that the regression models were in effect polynomial.


4.3.7.1 Stepwise Regression Modelling

Regressing all 225 variables against just over 100 GPS measurements of error is bound to produce a good fit. Attempting to maximise the fit with as few variables as possible gives more robust regression models. Stepwise regression modelling was used to identify the best combination of terrain parameters for predicting DEM error. This analysis proceeds in steps. At each step, the independent variable (terrain parameter) not currently in the equation that has the smallest probability of not being a significant variable is added to the equation, if that probability is below a user-specified threshold. Variables already in the regression equation are removed if their probability becomes larger than a user-specified threshold. The method terminates when no more variables are eligible for inclusion or removal. The probability of being a significant variable is calculated from the variable's F statistic. A low threshold value gives a low number of steps and a regression model with few variables. A high threshold value gives many steps and many variables. A threshold of 0.5 was used as this consistently led to a regression model with more than 20 variables.
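The general shape of a stepwise procedure can be sketched as below. This is a deliberately simplified stand-in, not the SPSS procedure: it only adds variables (no removal step) and it ranks candidates by adjusted R² rather than by the probability of the F statistic. All function names are hypothetical, and the small least-squares solver will fail with a division by zero if two candidate variables are exactly collinear.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def adjusted_r2(columns, y):
    """Least-squares fit of y on the given columns plus a constant."""
    n, p = len(y), len(columns)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p + 1)]
           for j in range(p + 1)]
    xty = [sum(row[j] * y[i] for i, row in enumerate(design))
           for j in range(p + 1)]
    beta = solve_linear(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def forward_select(candidates, y, max_vars):
    """Greedily add the variable that most improves adjusted R².

    candidates : dict mapping variable name to a list of values
    Returns (names in order of entry, adjusted R² of the final model).
    """
    chosen, best = [], float("-inf")
    while len(chosen) < max_vars:
        scores = {name: adjusted_r2([candidates[c] for c in chosen + [name]], y)
                  for name in candidates if name not in chosen}
        if not scores:
            break
        name = max(scores, key=scores.get)
        if scores[name] <= best:
            break  # no remaining variable improves the fit
        chosen.append(name)
        best = scores[name]
    return chosen, best
```

The order in which variables enter gives a rough ranking of their contribution, and the `max_vars` argument plays a role loosely comparable to reading off the step that contains a fixed number of variables.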

Stepwise regression modelling was performed using different groups of terrain parameters as independent variables and elevation error as the dependent variable. High threshold values were used to ensure that the analysis comprised many steps. Different regression equations were compared by looking at the regression coefficient of the step containing 20 variables. A regression equation of 20 variables does not necessarily equate to 20 terrain parameters as some variables may be the square or cube of a parameter. This cut-off point was chosen because there appeared to be little increment in adjusted R² values with more than 20 variables. This is most clearly demonstrated by plotting adjusted R² against number of variables for Snowdonia's SpTen12 DEM (Fig. 4.5). There is an abrupt jump in adjusted R² value between 19 and 20 variables, because three variables were removed, then four more useful variables added at this point in the regression modelling procedure.


[Figure: adjusted R² (y axis, 0 to 1) plotted against number of variables (x axis, 0 to 40).]

Fig. 4.5 Snowdonia SpTen12's adjusted R² values plotted against number of variables used in stepwise regression modelling.

4.3.7.2 Generating Error Surfaces

The regression equations with the highest regression coefficients were used to create error surfaces. The DEM Uncertainty and Quality extension to ArcView includes an Avenue script (“DEMUncQual_esurf.ave”, Appendix 2) that automates the production of an error surface. On running this script the user is prompted for the DEM from which to derive terrain parameters. Then a series of dialogue boxes allow the user to specify a constant, the terrain parameters and their coefficients from the regression equation. The script then derives grids representing the required terrain parameters from the DEM, multiplies these grids by the corresponding coefficients and adds the grids together to produce predicted error values for the entire extent of the DEM. In addition, the script applies a mean filter to the error values, calculates the local standard deviation of error values, and calculates the standard deviation of mean filtered error values to produce three more grids. These three operations all use a circular window of 20 cell radius. The grids representing standard deviation of error values are produced to provide a spatially variable estimate of DEM accuracy that can be used in stochastic simulation (see §5.2.1.4).
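The core grid arithmetic of this step, a constant plus a weighted sum of terrain parameter grids, can be sketched as follows. This illustrates the calculation only and is not the Avenue script; grids are assumed to be equally sized lists of rows, and the function name and the example coefficients in the usage note are hypothetical.

```python
def predict_error_surface(constant, terms):
    """Apply a regression equation cell by cell.

    constant : the regression constant
    terms    : list of (coefficient, grid) pairs, one per variable in the
               equation; each grid holds that variable's value in every cell
    Returns a grid of predicted error values.
    """
    n_rows = len(terms[0][1])
    n_cols = len(terms[0][1][0])
    surface = [[constant] * n_cols for _ in range(n_rows)]
    for coefficient, grid in terms:
        for r in range(n_rows):
            for c in range(n_cols):
                surface[r][c] += coefficient * grid[r][c]
    return surface
```

For example, `predict_error_surface(1.0, [(2.0, gradient_grid), (-1.0, curvature_grid)])` would evaluate a two-variable equation; squared or cubed variables are handled by passing the squared or cubed grid as its own term.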

4.3.8 Model Validation

The use of multivariate regression of terrain parameters to model DEM error was validated in five ways.

First, the GPS measurements were randomly split into three subsets of equal numbers of points. A regression equation was determined using two of the point subsets and the terrain parameters identified as most effective by the stepwise regression modelling. This regression equation was used to predict the elevation error for the remaining subset of points. This was performed three times so that elevation error for each subset of points was predicted. The predicted error values were then compared to the GPS-measured error values to assess whether the regression equation could be usefully employed to predict error values for the entire DEM.
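The three-way split can be sketched as below. The function names and the use of a fixed random seed are assumptions made for reproducibility of the illustration; when the number of points is not divisible by three, the subsets produced here differ in size by one point.

```python
import random

def three_way_split(n_points, seed=0):
    """Randomly assign point indices 0..n_points-1 to three subsets."""
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    return [indices[k::3] for k in range(3)]

def train_test_rounds(n_points, seed=0):
    """Three (training indices, test indices) pairs.

    In each round two subsets are used to fit the regression equation and
    the remaining subset is held back for prediction, so that every point
    is predicted exactly once.
    """
    folds = three_way_split(n_points, seed)
    rounds = []
    for k in range(3):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        rounds.append((train, test))
    return rounds
```

The predicted errors from the three held-back subsets can then be pooled and compared with the GPS-measured errors.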

Second, assessing the characteristics of each error surface identified spurious and extreme error values and allowed examination of the overall distribution and range of error values. This involved calculating summary statistics (minimum, maximum, mean and standard deviation) and examining the frequency distribution of error surface values.

Third, the error surfaces and standard deviation of error surfaces were visually assessed by examining 2D renderings and orthographic views. This allowed a general check for reasonableness in the scale of modelled error values and their spatial distribution.

Fourth, the whole process of deriving terrain parameters, stepwise regression modelling, creating error surfaces and validating the model was repeated using the Mestersvig data.

Fifth, appropriate error surfaces were used in stochastic simulations of the influence of DEM error. This fifth approach to model validation is the subject of Chapter 5.

4.4 Results for Snowdonia

4.4.1 Correlations

Table 4.2 summarises results for the correlations of error with the initial twelve terrain parameters. A table of all 12 correlations is given in Appendix 4.


Table 4.2 Snowdonia's most significant correlations and number of significant correlations of elevation error with the initial 12 terrain parameters.

| | SpTen12 | IDW512 | IDW112 |
| --- | --- | --- | --- |
| Coefficient of most significant correlation | 0.269 | -0.212 | -0.234 |
| Terrain parameter of most significant correlation | Plan curvature | Elevation | Elevation |
| No. of significant correlations at 0.05 level | 2 | 1 | 1 |

The spline with tension DEM shows the strongest correlations with elevation error. The inverse distance weighted DEMs have weaker correlations and only one significant correlation. However, there are no strong or even moderately strong correlations for any of the three DEMs.

Table 4.3 summarises results for the correlations of error with all 225 terrain parameters. A table of all 225 correlations is given in Appendix 5.

Table 4.3 Snowdonia's most significant correlations and number of significant correlations of elevation error with all 225 terrain parameters.

| | SpTen12 | IDW512 | IDW112 |
| --- | --- | --- | --- |
| Coefficient of most significant correlation | 0.487 | 0.458 | -0.564 |
| Terrain parameter of most significant correlation | Profile curvature | Texture% | Plan curvature |
| No. of significant correlations at 0.05 level | 57 | 42 | 54 |

The highest quality DEM (IDW112) shows the strongest correlations with terrain parameters and a high number of significant correlations. The spline with tension DEM again has the highest number of significant correlations with error. The lowest quality inverse distance weighting DEM (IDW512) can be seen to have weaker correlations with terrain parameters than the other two DEMs.

Further information is given by ordering the terrain parameters according to the strength of correlation with error and plotting against the absolute value of the correlation coefficient (Fig. 4.6).


[Figure: "Correlation with Error". Correlation coefficient (y axis, 0 to 0.6) plotted against rank of variable (x axis, 0 to 250), with one series each for IDW112, IDW512 and SpTen12.]

Fig. 4.6 Terrain parameters ranked according to strength of correlation and plotted against correlation coefficient.

Fig. 4.6 shows that a relationship between error and terrain characteristics is not as evident for the IDW512 DEM. SpTen12 appears to have the strongest relationship.

The number of moderately strong correlations indicates that there is a relationship between DEM errors and terrain characteristics. Also, no single terrain parameter gives a good indication of the amount of error, but a combination of parameters could.

4.4.2 Regression Modelling The following groups of terrain parameters were used as independent variables in the stepwise regression modelling: 1. Evans‟ 5 terrain parameters (elevation, gradient, aspect vectors, plan curvature and profile curvature) with gradient measured in degrees; 2. As for 1. plus overall curvature; 3. As for 1. plus average, maximum and minimum extremity; 4. Evans‟ 5 terrain parameters (elevation, gradient, aspect vectors, plan curvature and profile curvature) with gradient measured in percent; 5. As for 4. plus overall curvature; 6. As for 4. plus average, maximum and minimum extremity; 7. All parameters except curvature, with gradient measured in degrees; 8. All parameters with gradient measured in degrees; 9. All parameters except curvature, with gradient measured in percent; and, 10. All parameters with gradient measured in percent. Chapter 4 – Modelling the Spatial Distribution of DEM Error


The main output of the regression modelling is a coefficient of determination (R²) for each step of the modelling. The R² value estimates the proportion of the variance in Y, in this case elevation error, that is explained by the calculated regression equation, and so indicates how well the equation can be expected to predict elevation error at unsampled locations. The more variables that are used in the multiple regression the more likely it is that the independent variables (the terrain parameters) can be made to fit the dependent variable (elevation error) within the scope of the sample. When fitting 20 independent variables to 100+ observations it is highly likely that the R² value is overly optimistic. The adjusted R² value takes account of the number of variables used and gives a more realistic indication of how well the calculated regression equation will predict unknown elevation errors. It is this adjusted R² value that should be used to judge the effectiveness of the regressions performed. The highest adjusted R² value for each DEM is given in Table 4.4.

Table 4.4 Results of regression modelling for Snowdonia.

               SpTen12           IDW512            IDW112
Adjusted R²    0.826             0.757             0.827
Parameter set  All parameters    All parameters    All parameters except curvature
               (gradient in %)   (gradient in %)   (gradient in %)
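The size of the adjustment can be illustrated with a short sketch. It assumes the usual definition of adjusted R², which the statistics package used here is presumed, but not confirmed, to follow. The raw R² of 0.90 is a hypothetical input; the 106 sample points and 20 variables are taken from the text.

```python
def adjusted_r2(r2, n_obs, n_vars):
    """Adjusted R-squared for n_vars predictors fitted to n_obs observations."""
    if n_obs - n_vars - 1 <= 0:
        raise ValueError("need more observations than fitted terms")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_vars - 1)

# Hypothetical raw R-squared; sample size and variable count as in the text.
raw_r2 = 0.90
for n_vars in (5, 10, 20):
    print(n_vars, round(adjusted_r2(raw_r2, 106, n_vars), 3))
```

The penalty grows with the number of variables, which is why a 20 variable equation fitted to roughly 100 observations should be judged on the adjusted value.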

The results suggest that the DEM's terrain parameters explain over 80% of the variation in elevation error in the SpTen12 and IDW112 DEMs. The proportion is lower (about 76%) for the lower quality IDW512 DEM. It should be noted that the adjusted R² value might be under-estimated due to the use of a terrain parameter and its square or cube. The terrain parameters used in each DEM's 20 variable regression equation are listed in Table 4.5. The parameters are ordered according to the significance of their contribution to the modelled error.



Table 4.5 Regression equation variables for Snowdonia (most significant first).

SpTen12     IDW512      IDW112
CH_AV5³     Z_SD20³     CH_AV20
GP_AV20³    MinEx10     CV_AV20²
CH_AV20³    CH_SD20     MaxEx20³
C_AV5       GP³         Ay_SD20³
C_SD5       GP_AV20     Z_AV20³
AvEx5²      CH          MinEx10³
CH          Ay_AV10³    CV_AV10
CV_AV20     Ay_SD5      Ax_SD10³
CV_SD20     Ay_AV20²    AvEx10²
Ax_SD10³    Ax_AV20     CV
Z_AV10      Ay_SD10³    Ax_SD20³
C_AV20³     GP_SD20     Z_AV10
CH_AV20     CV_SD5³     TextP5³
TextP10     CH_SD5²     CH_SD5²
C_SD10²     C³          TextP20
CV_SD10     CV          TextP5²
CH_SD10²    MinEx20²    TextP20³
Ax_SD20²    CH²         CV_AV5
CV²         Ax_AV5      Ax_SD10²
Ax_SD5²     Ax_AV5²     Ax_SD10

where:
Z = elevation
GP = gradient measured in percent
Ax = aspect vector X
Ay = aspect vector Y
CH = plan (horizontal) curvature
CV = profile (vertical) curvature
C = overall curvature
AvEx = average extremity
MaxEx = maximum extremity
MinEx = minimum extremity
TextP = texture measured in percent
AV = neighbourhood average
SD = neighbourhood standard deviation
5, 10 and 20 indicate the radius of neighbourhood parameters in number of cells
² and ³ indicate the square and cube of a parameter respectively

For all three DEMs, the regression equation includes elevation, aspect vector, plan curvature, profile curvature and extremity parameters. Gradient and texture are each used in two of the three DEMs. Relative relief and the distance to the nearest contour vertex are not used in any of the regression equations. All three equations include neighbourhood parameters measured over all three radii of 5, 10 and 20 cells and also include squared and cubed variables. SpTen12's equation is dominated by curvature parameters (13 out of 20 terms). IDW512's equation is dominated by curvature (7 terms) and aspect vector (7 terms) parameters. IDW112's equation mainly comprises curvature (6 terms), aspect vector (5 terms) and texture (4 terms) parameters.

The three regression equations have been used to create error surfaces and standard deviation of error surfaces for each DEM. These surfaces are described in the following section.
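Creating an error surface amounts to evaluating the fitted regression equation at every grid cell. The sketch below is illustrative only: the intercept, coefficients and grids are hypothetical, and each term is written as a (parameter, power, coefficient) triple so that squared and cubed parameters can be represented.

```python
def predict_error_surface(intercept, terms, parameter_grids):
    """Evaluate a regression equation at every cell of a raster.

    terms is a list of (parameter_name, power, coefficient) triples and
    parameter_grids maps each parameter name to a 2D list of cell values.
    """
    first = parameter_grids[terms[0][0]]
    n_rows, n_cols = len(first), len(first[0])
    surface = [[intercept] * n_cols for _ in range(n_rows)]
    for name, power, coefficient in terms:
        grid = parameter_grids[name]
        for r in range(n_rows):
            for c in range(n_cols):
                surface[r][c] += coefficient * grid[r][c] ** power
    return surface

# Hypothetical 2 x 2 terrain parameter grids and coefficients.
grids = {
    "GP": [[10.0, 20.0], [30.0, 40.0]],   # gradient in percent
    "CH": [[1.0, 0.0], [-1.0, 2.0]],      # plan curvature
}
terms = [("GP", 1, 0.5), ("CH", 2, 2.0)]  # 0.5 * GP + 2.0 * CH squared
error_surface = predict_error_surface(-1.0, terms, grids)
print(error_surface)
```

Because the equation is extrapolated to cells whose terrain parameters lie outside the range sampled by the GPS points, cubed terms in particular can produce very large predicted values.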



4.4.3 Model Validation

4.4.3.1 Predicted Errors

Values for the minimum, maximum, range, mean and standard deviation of errors at the 106 sample points are given in Table 4.6. The table shows the actual errors and the corrected errors (actual minus predicted error).

Table 4.6 Distribution of actual and corrected errors for Snowdonia.

SpTen12                   Minimum   Maximum   Range    Mean     Standard deviation
Actual errors             -4.31     12.24     16.55    -2.60    3.78
Corrected errors          -7.51     25.18     32.70    0.08     3.58
Corrected as % of actual  174%      206%      196%     3%       95%

IDW512                    Minimum   Maximum   Range    Mean     Standard deviation
Actual errors             -4.98     10.43     15.41    -2.77
Corrected errors          -26.14    7.61      33.75    -0.52
Corrected as % of actual  525%      73%       219%     19%      99%

IDW112                    Minimum   Maximum   Range    Mean     Standard deviation
Actual errors             -6.19     15.33     21.52    -2.72
Corrected errors          -11.14    6.71      17.85    -0.14
Corrected as % of actual  180%      44%       83%      5%       63%

The regression modelling can be considered successful when the corrected values are less than 100% of the actual values. Clearly this does not occur for all summary statistics. All minimum corrected errors and SpTen12's maximum corrected error exceed the actual errors. However, the range of corrected errors is lower than the actual range of errors for the IDW112 DEM. Additionally, the standard deviation of corrected errors is lower than the standard deviation of actual errors for all three DEMs. This indicates that there is a small number of extreme corrected errors, but overall the corrected errors are less widely dispersed than the actual errors. Nonetheless, it is only with IDW112's regression modelling that the spread of error values has been significantly reduced. For all three DEMs, the mean corrected error is near zero, indicating that the regression modelling successfully removes systematic bias.
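The comparison in Table 4.6 can be sketched as follows. The sample values are hypothetical, and two details are assumptions: the sample (n - 1) standard deviation is used, and the percentages are computed from absolute values, which is consistent with the signs shown in the table.

```python
import statistics

def summarise(values):
    """Minimum, maximum, range, mean and sample standard deviation."""
    lo, hi = min(values), max(values)
    return {"min": lo, "max": hi, "range": hi - lo,
            "mean": statistics.mean(values), "sd": statistics.stdev(values)}

def corrected_as_percent(actual, predicted):
    """Each corrected-error statistic as a percentage of the actual-error statistic."""
    corrected = [a - p for a, p in zip(actual, predicted)]
    s_actual, s_corrected = summarise(actual), summarise(corrected)
    return {key: 100.0 * abs(s_corrected[key]) / abs(s_actual[key])
            for key in s_actual if s_actual[key] != 0}

# Hypothetical actual and predicted errors (metres) at five sample points.
actual = [-4.0, -2.0, 0.0, 2.0, 9.0]
predicted = [-3.0, -2.5, 1.0, 1.0, 6.0]
percent = corrected_as_percent(actual, predicted)
for key, value in percent.items():
    print(key, round(value, 1))
```

Values below 100 indicate that correcting the DEM with the predicted errors reduces that statistic.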



Overall, Table 4.6 suggests that the regression modelling is most successful for IDW112, the highest quality DEM. There is little evidence that modelling the error of the other two DEMs is useful.

4.4.3.2 Error Surface Characteristics

Summary statistics for the three error surfaces are shown in Table 4.7 with corresponding figures for the 106 GPS sample points. The minimum and maximum values are clearly extreme. A mean filter of circular 20 cell radius was applied to the error surfaces to reduce this occurrence of occasional extreme values. Summary statistics for these mean error surfaces are also given in Table 4.7.

Table 4.7 Summary statistics for error surfaces and mean error surfaces of Snowdonia.

SpTen12       Minimum    Maximum    Mean     SD
GPS sample    -20.03     12.24      -2.60    3.78
Error         -2872      2468       -3.23    22.59
Mean error    -210       37         -3.22    7.94

IDW512        Minimum    Maximum    Mean     SD
GPS sample    -4.98      10.43      -2.77    3.96
Error         -12815     15010      -1.45    39.60
Mean error    -20.77     208.80     -1.46    7.42

IDW112        Minimum    Maximum    Mean     SD
GPS sample    -6.19      15.33      -2.72    5.11
Error         -194       389        -2.26    7.65
Mean error    -17        66         -2.26    3.96

The mean values for the IDW112 surfaces are slightly higher than the GPS sample mean. For SpTen12 the mean values of the surfaces are too low and for IDW512 they are too high. The mean filter drastically reduces the maxima and minima and consequently brings standard deviation closer to the GPS sample standard deviation. However, unrealistically large maximum or minimum errors remain except for the IDW112 mean error surface.
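The effect of a circular mean filter on an isolated extreme value can be shown with a small sketch. A radius of 2 cells and a 5 x 5 grid are used purely for illustration (the study used a 20 cell radius), and cells outside the grid are simply ignored, which is an assumption about edge handling rather than a description of the GIS software used.

```python
def circular_offsets(radius):
    """Row and column offsets of all cells within `radius` cells of a centre cell."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]

def focal_mean(grid, radius):
    """Mean of each cell's circular neighbourhood, ignoring cells outside the grid."""
    n_rows, n_cols = len(grid), len(grid[0])
    offsets = circular_offsets(radius)
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            values = [grid[r + dr][c + dc] for dr, dc in offsets
                      if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols]
            out[r][c] = sum(values) / len(values)
    return out

# Hypothetical error surface: zero everywhere except one extreme cell.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 130.0
smoothed = focal_mean(spike, 2)
print(smoothed[2][2], max(max(row) for row in smoothed))
```

The extreme value is spread over its neighbourhood rather than removed, which is consistent with the reduced but still unrealistic maxima and minima reported for the mean error surfaces.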

The summary statistics indicate that prediction of actual errors by means of regression modelling is only partially successful. There is uncertainty about the accuracy of the error surface. A spatially variable estimate of DEM accuracy could be more appropriate than predicting actual error on a cell by cell basis. This accuracy estimate could be a surface representing the standard deviation of estimated error for a cell and its neighbours. The neighbourhood calculations involved in producing such a surface could cancel out the extreme values. Also, an estimate of accuracy is required as input to modelling DEM uncertainty (see Chapter 5), whereas a prediction of actual error only allows a corrected DEM to be produced.

The standard deviation of error values within a 20 cell circular radius of each cell was calculated from both the error surface and the mean error surface for each DEM. Summary statistics for these accuracy surfaces are shown in Table 4.8.

Table 4.8 Summary statistics for accuracy surfaces of Snowdonia.

SpTen12             Minimum    Maximum    Mean     SD
GPS sample          -20.03     12.24      -2.60    3.78
SD of error         0.03       450.29     5.03     20.04
SD of mean error    0.01       33.10      0.72     2.12

IDW512              Minimum    Maximum    Mean     SD
GPS sample          -4.98      10.43      -2.77    3.96
SD of error         0.00       941.35     9.22     37.09
SD of mean error    0.01       35.97      0.79     1.92

IDW112              Minimum    Maximum    Mean     SD
GPS sample          -6.19      15.33      -2.72    5.11
SD of error         0.02       71.11      2.63     3.84
SD of mean error    0.03       10.49      0.54     0.71

The summary statistics of a high quality accuracy surface would show a minimum value close to zero, a low maximum value and a mean value similar to the GPS sample's standard deviation. Of the standard deviation of error surfaces only IDW112's appears reasonable. The other two surfaces have high maximum and mean values. The standard deviation of mean error surfaces have much lower maximum values. However, the mean values are also reduced to below the GPS sample's standard deviation. This indicates that the accuracy values (standard deviations of error) are generally underestimated by these surfaces.
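An accuracy surface of this kind can be sketched as a focal standard deviation. The example uses a 1 cell radius on a hypothetical 3 x 3 error surface; the population form of the standard deviation and the treatment of cells outside the grid are assumptions, since the text does not specify either.

```python
import statistics

def focal_sd(grid, radius):
    """Standard deviation of the values within a circular radius of each cell."""
    n_rows, n_cols = len(grid), len(grid[0])
    offsets = [(dr, dc)
               for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if dr * dr + dc * dc <= radius * radius]
    out = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Neighbours that fall outside the grid are left out (assumption).
            values = [grid[r + dr][c + dc] for dr, dc in offsets
                      if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols]
            row.append(statistics.pstdev(values))
        out.append(row)
    return out

# Hypothetical predicted error surface (metres) with one anomalous cell.
error_surface = [[1, 1, 1],
                 [1, 5, 1],
                 [1, 1, 1]]
accuracy_surface = focal_sd(error_surface, 1)
for row in accuracy_surface:
    print([round(v, 2) for v in row])
```

Cells whose neighbourhood contains only similar error values receive a standard deviation near zero, while cells near the anomaly receive a higher value.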

These accuracy surfaces were further assessed by examining cumulative frequency distributions. The distributions of values for the standard deviation of error surfaces are shown in Fig. 4.7.



[Figure: % of cells (75 to 100) plotted against Standard deviation (5 to 105) for SpTen12, IDW512 and IDW112.]

Fig. 4.7 Cumulative frequency distribution for standard deviation of error surfaces.

The quality of IDW112's standard deviation of error surface revealed by the summary statistics is confirmed by the cumulative frequency graph. 92% of cells have a standard deviation less than 5 m, 97% of cells have a standard deviation of less than 10 m and there is a short tail to the distribution. Both SpTen12's and IDW512's standard deviation of error surfaces have much longer tails (the graph is curtailed at 105 m; the actual tails extend further than this). 86% of SpTen12's cells have a standard deviation less than 5 m, but for IDW512 the figure is only 76%. IDW112's standard deviation of error surface is clearly the highest quality, but SpTen12's may also be useful despite its high maximum and long tail. IDW512's standard deviation of error surface does not seem to be a sufficiently good model of the DEM's accuracy.
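Figures such as "92% of cells have a standard deviation less than 5 m" are points on the cumulative frequency curve. A minimal sketch of the calculation, using a hypothetical accuracy surface, is:

```python
def percent_below(grid, threshold):
    """Percentage of cells whose value is strictly below the threshold."""
    cells = [value for row in grid for value in row]
    return 100.0 * sum(1 for value in cells if value < threshold) / len(cells)

def cumulative_frequency(grid, thresholds):
    """(threshold, percentage of cells below it) pairs for a list of thresholds."""
    return [(t, percent_below(grid, t)) for t in thresholds]

# Hypothetical standard deviation of error surface (metres).
accuracy_surface = [[0.5, 2.0, 4.9, 7.0],
                    [1.0, 3.0, 12.0, 120.0]]
curve = cumulative_frequency(accuracy_surface, [5, 10, 105])
print(curve)
```

A long tail shows up as a curve that approaches 100% only slowly as the threshold increases.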



4.4.3.3 Visual Assessment

Orthographic views of the standard deviation of error surfaces are shown in Fig. 4.8. Note that a geometric scale is used for the rendering.

Fig. 4.8 Standard deviation (SD) surfaces draped over orthographic views of the corresponding Snowdonia DEM: a) SpTen12 SD of error; b) SpTen12 SD of mean error; c) IDW512 SD of error; d) IDW512 SD of mean error; e) IDW112 SD of error; f) IDW112 SD of mean error.

The orthographic views clearly show that standard deviation of mean error removes the more extreme values and produces a rather spatially invariable accuracy surface. All three DEMs have similar distributions of standard deviation of error. The lowest accuracy is found on the central parts of the steepest slopes. Gentle and evenly sloping terrain has the highest accuracy. Also, the findings of the assessment of frequency distributions are visually portrayed. SpTen12 has the highest proportion of cells with a standard deviation less than 1 m (brown areas in Fig. 4.8a), but also a high number of more extreme values (dark green and blue in Fig. 4.8a). IDW112 shows a clear variation in the distribution of cells with a standard deviation of less than 4 m, no extreme values and only a few cells with a standard deviation of above 32 m.

4.5 Results for Mestersvig

4.5.1 Correlations

Table 4.9 summarises results for the correlations of error with the initial twelve terrain parameters. A table of all 12 correlations is given in Appendix 6.

Table 4.9 Mestersvig's most significant correlations and number of significant correlations of elevation error with the initial 12 terrain parameters.

                                                 SpTen12      IDW512       IDW112
Most significant correlation: coefficient        -0.469       -0.480       -0.445
Most significant correlation: terrain parameter  Elevation    Elevation    Elevation
No. of significant correlations at 0.05 level    6            8            5

All three DEMs show moderately strong negative correlations with elevation. The correlations are stronger, and more of them are significant, than for the Snowdonia DEMs.

Table 4.10 summarises results for the correlations of error with all 225 terrain parameters. A table of all 225 correlations is given in Appendix 7.

Table 4.10 Mestersvig's most significant correlations and number of significant correlations of elevation error with all 225 terrain parameters.

                                                 SpTen12            IDW512       IDW112
Most significant correlation: coefficient        0.629              -0.597       0.568
Most significant correlation: terrain parameter  Aspect vector X    Elevation    Elevation
No. of significant correlations at 0.05 level    103                125          92

In contrast to the Snowdonia DEMs, IDW512 has the greatest number of significant correlations. All three DEMs have similar strongest correlation coefficients.

In Fig. 4.9 terrain parameters ordered according to the strength of correlation with error are plotted against the absolute value of the correlation coefficient.



[Figure: "Correlation with Error". Correlation Coefficient (0 to 0.7) plotted against Rank of Variable (0 to 250) for SpTen12 (st12), IDW112 (idw112) and IDW512 (idw512).]

Fig. 4.9 Terrain parameters ranked according to strength of correlation and plotted against correlation coefficient.

Fig. 4.9 shows that the IDW512 DEM has a higher number of slight correlations, but all three DEMs show similar numbers of moderately strong correlations. This contrasts with the Snowdonia data where the lowest quality DEM (IDW512) had fewer and less strong correlations. It should be noted that because the DEM quality assessment has not been undertaken for the Mestersvig DEMs, it is not known whether IDW512 is in this instance a lower quality DEM or not.

As for Snowdonia, the number of moderately strong correlations indicates that there is a relationship between DEM errors and terrain characteristics. The Mestersvig correlation coefficients are generally higher than for Snowdonia and there are a greater number of significant correlations, but again no single terrain parameter gives a good indication of the amount of error.

4.5.2 Regression Modelling

The stepwise regression techniques were reapplied to the Mestersvig data. The highest adjusted R² value for each DEM is given in Table 4.11.



Table 4.11 Results of regression modelling for Mestersvig.

               SpTen12           IDW512                             IDW112
Adjusted R²    0.902             0.910                              0.887
Parameter set  All parameters    All parameters except curvature    All parameters
               (gradient in %)   (gradient in degrees)              (gradient in degrees)

Adjusted R² values for all three DEMs are higher than those for the Snowdonia DEMs, suggesting that elevation error can be modelled by the DEM's terrain parameters with a high degree of success. For the two inverse distance weighted DEMs gradient is best measured in degrees whereas for the Snowdonia DEMs it is best to measure gradient in percent. The terrain parameters used in each DEM's 20 variable regression equation are listed in Table 4.12. The parameters are ordered according to the significance of their contribution to the modelled error.

Table 4.12 Regression equation variables for Mestersvig (most significant first).

SpTen12     IDW512      IDW112
Ax_AV20³    Z_AV20³     Ax_AV20³
Z_AV20²     Ax_AV20³    AvEx5³
Ax_SD10³    Ch_AV10³    Z_AV10²
MinEx10³    MinEx10²    Rel20
MaxEx10     Cv_AV5²     Ay_AV5³
Ay_SD20     Ch_SD20³    Cv³
Ch_AV5²     Z³          C_AV20²
MinEx20³    Text10²     Ch_SD10
Rel20       Ay_AV10³    Z_SD20
MinEx20²    Z_AV10²     MaxEx20
Ay_SD5³     MinEx20     Ax_SD20
AvEx5³      Gd_SD10³    Gd_AV20
Cv          Z           Ch_AV5²
Z_AV20³     Gd_AV10³    MaxEx20³
TextP5²     Ch_AV10     Cv_AV5²
Cv_AV10     Cv_AV20     Z_SD10²
Z_AV20      Cv_SD20³    Ax_SD5
TextP20     Ay_SD20     Ay_AV10²
AvEx5²      Cv_AV10     Ay_AV5²
Ch_SD20³    Ay_SD5³     AvEx10²

where:
Z = elevation
Gd = gradient measured in degrees
Ax = aspect vector X
Ay = aspect vector Y
Ch = plan (horizontal) curvature
Cv = profile (vertical) curvature
C = overall curvature
AvEx = average extremity
MaxEx = maximum extremity
MinEx = minimum extremity
Rel = relative relief
Text = texture measured in degrees
TextP = texture measured in percent
AV = neighbourhood average
SD = neighbourhood standard deviation
5, 10 and 20 indicate the radius of neighbourhood parameters in number of cells
² and ³ indicate the square and cube of a parameter respectively

The variety of parameters used in the Mestersvig regression equations is similar to that for Snowdonia. For all three DEMs, the regression equation includes elevation, aspect vector, plan curvature, profile curvature and extremity parameters. Gradient and texture are used in two of the three DEMs. Relative relief (the range of elevation values), which was not used in any of the Snowdonia regression equations, is also used in two regression equations. The distance to the nearest contour vertex is again not used in any of the regression equations. All three equations include neighbourhood parameters measured over all three radii of 5, 10 and 20 cells and also include squared and cubed variables.

There is less domination of the regression equations by a particular type of parameter than for Snowdonia. Elevation, aspect vectors, curvature and extremity parameters are commonly used in all three equations. Texture, relative relief and gradient are less frequent.

4.5.3 Model Validation

4.5.3.1 Predicted Errors

Values for the minimum, maximum, range, mean and standard deviation of errors at the 103 sample points are given in Table 4.13. The table shows the actual errors and the corrected errors (actual minus predicted error).



Table 4.13 Distribution of actual and corrected errors for Mestersvig.

SpTen12                   Minimum    Maximum   Range     Mean    Standard deviation
Actual errors             -101.02    118.32    219.34    4.90
Corrected errors          -42.25     59.56     101.81    0.58
Corrected as % of actual  46%        50%       46%       12%     47%

IDW512                    Minimum    Maximum   Range     Mean    Standard deviation
Actual errors             -112.98    110.66    223.63    4.68
Corrected errors          -40.83     29.06     69.89     0.73
Corrected as % of actual  36%        26%       31%       16%     37%

IDW112                    Minimum    Maximum   Range     Mean    Standard deviation
Actual errors             -99.95     118.65    218.60    6.83
Corrected errors          -82.36     207.31    289.67    2.91
Corrected as % of actual  82%        175%      133%      43%     84%

The prediction success for the Mestersvig DEMs is significantly better than for the Snowdonia DEMs. For the SpTen12 and IDW512 DEMs both the mean error and the dispersion of error values are much reduced. The regression equation for IDW112 performs less well with an increase in maximum error and only a slight decrease in the standard deviation. In Snowdonia the correction statistics for this DEM were better than the other two. Overall, Table 4.13 suggests that the regression modelling is most successful for IDW512, but also useful for SpTen12. There is little evidence that modelling IDW112's error is beneficial.

4.5.3.2 Error Surface Characteristics

Summary statistics for the three Mestersvig error and mean error surfaces are shown in Table 4.14 with corresponding figures for the 103 GPS sample points.



Table 4.14 Summary statistics for error surfaces and mean error surfaces of Mestersvig.

SpTen12       Minimum    Maximum    Mean      SD
GPS sample    -101.02    118.32     4.90      32.41
Error         -166350    56915      -3.95     244.56
Mean error    -214       2057       -3.76     56.77

IDW512        Minimum    Maximum    Mean      SD
GPS sample    -112.98    110.66     4.68      33.83
Error         -6835      1403       -5.76     76.50
Mean error    -2724      218        -5.69     53.16

IDW112        Minimum    Maximum    Mean      SD
GPS sample    -99.95     118.65     6.83      32.86
Error         -18269     27780      -22.97    104.81
Mean error    -311       1120       -22.99    52.90

Despite the differences in prediction success for the three DEMs, the summary statistics for the three error and mean error surfaces are similar. For all surfaces the mean values are noticeably lower than those of the GPS samples. This could be because the distribution of GPS sample points on different terrain types does not match the distribution of terrain types for the whole DEM area. It does not necessarily mean that the error surfaces are incorrect. However, the mean values for the IDW112 surfaces are particularly low and are likely to be due to the poorer regression modelling results. As for Snowdonia, the mean filter drastically reduces the maxima and minima and consequently brings standard deviation closer to the GPS sample standard deviation. However, unrealistically extreme errors remain for all DEMs.

As for Snowdonia, the standard deviation of error values within a 20 cell circular radius of each cell was calculated from both the error surface and the mean error surface for each DEM. Summary statistics for these accuracy surfaces are shown in Table 4.15.



Table 4.15 Summary statistics for accuracy surfaces of Mestersvig.

SpTen12             Minimum    Maximum    Mean     SD
GPS sample          -101.02    118.32     4.90     32.41
SD of error         0.01       8823       32.39    181.41
SD of mean error    0.01       276        4.84     9.51

IDW512              Minimum    Maximum    Mean     SD
GPS sample          -112.98    110.66     4.68     33.83
SD of error         0.01       1732       23.03    39.56
SD of mean error    0.02       316.48     5.17     7.77

IDW112              Minimum    Maximum    Mean     SD
GPS sample          -99.95     118.65     6.83     32.86
SD of error         0.01       3215       26.42    78.07
SD of mean error    0.05       178        4.09     4.81

Of the standard deviation of error surfaces only IDW512's appears at all reasonable. Although all three have mean values close to the standard deviation of the corresponding GPS sample, the other two surfaces have high maximum and standard deviation values. The standard deviation of mean error surfaces have similar characteristics to those for Snowdonia with much lower maximum values, but also mean values reduced to below the GPS sample's standard deviation. This means that the accuracy values (standard deviations of error) are generally underestimated by these surfaces.

The distributions of values for the standard deviation of error surfaces are shown in Fig. 4.10.



[Figure: "Cumulative frequency distribution for standard deviation of error". % of cells (75 to 100) plotted against Standard deviation (10 to 210) for SpTen12 (st12 esd), IDW512 (idw512 esd) and IDW112 (idw112 esd).]

Fig. 4.10 Cumulative frequency distribution for standard deviation of error surfaces.

The quality of IDW512's standard deviation of error surface revealed by the summary statistics is confirmed by the cumulative frequency graph. 95% of cells have a standard deviation less than 50 m, over 99% of cells have a standard deviation of less than 100 m and there is a short tail to the distribution. Both SpTen12's and IDW112's standard deviation of error surfaces have longer tails (the graph is curtailed at 210 m; the actual tails extend further than this). IDW512's standard deviation of error surface is the highest quality, followed by SpTen12's. The IDW112 surface does not seem to be a sufficiently good model of the DEM's accuracy.



4.5.3.3 Visual Assessment

Orthographic views of the standard deviation of error surfaces are shown in Fig. 4.11. Note that a geometric scale is used for the rendering.

Fig. 4.11 Standard deviation (SD) surfaces draped over orthographic views of the corresponding Mestersvig DEM: a) SpTen12 SD of error; b) SpTen12 SD of mean error; c) IDW512 SD of error; d) IDW512 SD of mean error; e) IDW112 SD of error; f) IDW112 SD of mean error.

The orthographic views clearly show that standard deviation of mean error removes the more extreme values and gives lower standard deviation values. There are differences in the distributions of standard deviation of error, most noticeably in the lower coastal areas. Lowest accuracy is found along ridge tops and peaks, unlike Snowdonia where lowest accuracy is found on the central parts of the steepest slopes. Accuracy is also lower on southeast facing slopes. As suggested by McDermid and Franklin (1995), this could be due to these slopes being in shadow at the time of the aerial photography from which the original contour line data were created. There is a clear contrast between the accuracy of the flatter coastal areas and the more mountainous interior. The lower frequency of extreme values in the IDW512 standard deviation of error surface can be seen. For all three surfaces, some of the low number of extreme accuracy values are found at the edge of the study area. These extreme values are probably caused by poor interpolation where there is an inadequate distribution of contour vertices and edge effects influencing the derivation of terrain parameters.

4.6 Discussion

This section considers the hypothesis that DEM error is related to terrain characteristics, examines the quality of the DEM error and accuracy surfaces and discusses issues affecting the quality of these surfaces.

4.6.1 The Relationship between DEM Error and Terrain Character

The research presented in this chapter is based on testing the hypothesis that the spatial variation in a DEM's error is related to characteristics of the terrain. This hypothesis has been developed in response to the anecdotal and empirical evidence of authors such as Guth (1992), Bolstad and Stowe (1994), Gao (1997) and Hunter and Goodchild (1997). For both study areas and all six DEMs coefficients of up to approximately ρ = 0.5 have been identified for the correlation between DEM error and a number of terrain parameters. These values are lower than the ρ = 0.64 that Kyriakidis et al. (1999) computed for the correlation between error and standard deviation of elevation. Kyriakidis et al. (1999) consider error as the difference between two DEMs rather than the difference between a DEM and the elevation of the real terrain. This difference in the definition of error is likely to account for the lower correlation coefficients found in this study. The lower correlation coefficients and the number of significant correlations indicate that DEM error is related to terrain character, but that this relationship is best quantified by multivariate modelling of a number of terrain parameters. The high adjusted R² values of up to approximately 0.9 and the predicted error statistics substantiate this.

4.6.2 The Quality of Error and Accuracy Surfaces

A new method for creating spatially variable, spatially correlated and heteroscedastic error surfaces has been developed and successfully applied to DEMs of the Snowdonia and Mestersvig study areas. However, the quality of these error surfaces is variable. Although the average value for a whole error surface is usually reasonable, there are extreme maximum and minimum values and figures for standard deviation of an error surface indicate that there is an unrealistically wide dispersion about the mean. Applying a 20 cell radius mean filter does reduce the standard deviation and reduces, but does not remove, the extreme values. A filter of sufficiently large window size would remove these extreme values. Larger filter sizes (up to 50 cell radius) were used. However, extreme values still remained and excessive smoothing of non-extreme values was removing local variation in error values. Due to these characteristics of the error surfaces they are not suited to correcting a DEM.

Although they are of limited quality, the error surfaces can be used to generate standard deviation of error surfaces. These accuracy surfaces estimate error characteristics for neighbourhoods of cells and can be used to model the degree of uncertainty in subsequent results of DEM-based analyses. The more anomalous error values are subdued by the generalisation involved in summarising error values for local 20 cell windows.

For both Snowdonia and Mestersvig at least one of the accuracy surfaces can be considered of sufficient quality for further use. The results for prediction success reflect the quality of the subsequently produced accuracy surfaces. The prediction success can be used to identify which accuracy surface to create. The technique has been demonstrated to work for two areas of mountain terrain, modelled at different resolutions using different types of source data. The indications are that the technique is broadly applicable. However, there are a number of issues to consider, which are discussed below.

4.6.3 GPS Sample Point Issues

The error surface technique involves estimating errors at over a million grid cells from approximately 100 GPS measurements. The number and location of these sample points will influence the quality of the accuracy surfaces. An appropriate number of GPS measurements was determined by applying Li's (1991) equations (§3.1.3.1). According to Li's (1991) equations 100 GPS sample points should give a 95% reliable estimate of a DEM's accuracy. However, this relates to the reliability of a global accuracy estimate rather than local estimates. Further research is needed to investigate the influence of the number and location of sample points on the resulting accuracy surfaces.

Time, accessibility and GPS operational issues mean that the sample points give a limited representation of the true variety of terrain characteristics found in a mountain environment. There will be terrain that is inaccessible and locations where GPS surveying will not be successful due to the obstruction of the sky by the terrain. Both study areas contain steep slopes and near vertical rock faces that cannot be sampled. The coastal region of the Mestersvig study area is characterised by near flat terrain traversed by numerous heavily braided, often deep and fast flowing channels, which are largely inaccessible. Also, the ruggedness of mountain terrain means that there is a high degree of variability in terrain character over relatively short distances. An impractically large number of GPS survey points would be required to sample all varieties of terrain character. For these reasons a derived regression model will give a limited representation of the relationship between terrain character and DEM error.

4.6.4 Choice of Terrain Parameters

All of the types of derived terrain parameter have been used in at least one regression model. Additional parameters that quantify the form of the terrain may be useful. Also, the values of derivatives are dependent on the algorithms used (Jones, 1998). ArcView's implementations of Horn's (1982) and Zevenbergen and Thorne's (1987) algorithms have been used here to derive gradient and curvature values. Other algorithms would compute different derivative values that may give a better regression model. The regression models' inclusion of mean and standard deviation parameters calculated using 5, 10 and 20 cell radius filter windows indicates that error is related to terrain characteristics measured at various scales. Other filter window radii may be more appropriate than those chosen for use here. Processing time becomes a limiting factor for filter windows of greater than 20 cell radius, and also when a large number of different sized filter windows are used. However, further research would be beneficial into the influence of choice of filter window size on the regression modelling results.



Squared and cubed terrain parameters have been derived and selected for use in the regression models. Other transformations may also be useful. However, when the logarithms and exponents of the terrain parameters of the Snowdonia SpTen12 DEM were briefly investigated, no significant correlations with DEM error were found and these parameters were not selected in the stepwise regression modelling routine.

Although there are a number of ways in which terrain parameters could be further explored, it seems unlikely that the benefit to the regression modelling will be great. High adjusted R² values of up to 0.9 have already been achieved with the terrain parameters used in this study.

4.6.5 Quality of Terrain Parameters

The distribution of error has been modelled from terrain parameters derived from the DEMs, rather than from on-the-ground measurement of the true terrain parameters. The DEMs contain errors and therefore the derived terrain parameters will be subject to error. A gradient grid will not be a true representation of the real gradient. It is not possible to measure terrain parameters in the field with a sufficient degree of accuracy and consistency. This is a prime reason for the widespread use of DEMs. Terrain parameters derived from a DEM give the best available description of terrain character. Terrain parameters derived from an accurate DEM might give a better model of the distribution of error in a lower quality DEM than terrain parameters derived from the lower quality DEM itself. However, this is not useful because, as described in §3.2.3.1, the DEM in question is almost always the highest quality DEM available. The DEM-derived terrain parameters only give an approximation of terrain character. Therefore, a fully accurate model of the relationship between terrain character and DEM error cannot be achieved by this method.

Many of the cells with extreme error values are found towards the boundaries of the study areas. This is due to lower quality DEM interpolation and edge effects influencing derivation of terrain parameters at these locations. To mitigate these effects, the regression modelling should use a DEM that extends more than 20 cells beyond the limits of the study area. Subsequently derived error and accuracy surfaces should then be cropped to the extent of the study area.



4.6.6 Differences between the Error and Accuracy Surfaces The six regression models all utilise a variety of the terrain parameter types, but they differ in the specific terrain parameters used. There is clearly no generic relationship between DEM errors and terrain character. This indicates that the nature of the terrain being modelled and the method used to generate the DEM influence the distribution of errors, the parameters that are most useful to the regression model and how well the model fits the sampled DEM errors. The correlation coefficients, adjusted R² values and accuracy surface summary statistics indicate that the relationship between error and terrain character in the Mestersvig area is stronger than for Snowdonia and consequently a better accuracy surface can be created. The technique has worked better for the smaller scale, coarser resolution, and lower accuracy study area. DEM errors can be assumed to have a heteroscedastic element and a random element. The heteroscedastic element is related to terrain character and can be modelled from DEM-derived terrain parameters. It is likely that the random element is due to small variations in the accuracy of elevation measurements and small local variations in the elevation of the terrain, for instance individual hummocks or boulders, that cannot be captured at the DEM scale concerned. The difference between the quality of accuracy surfaces for the two study areas could be because the higher resolution, higher accuracy DEMs reduce much of the heteroscedastic component leaving random errors to dominate to a greater extent.

4.7 Conclusions The research presented in this chapter has shown that error in a DEM is spatially variable. The magnitude and distribution of errors are related to the varying character of the terrain. GPS surveys of elevation error and DEM-derived terrain parameters, which quantify terrain character, can be used in regression modelling to estimate the distribution of DEM errors in the form of an error surface. The nature of the relationship between DEM error and terrain parameters varies according to the type of terrain, the resolution of the DEM and the DEM production method.

The quality of the error surface is limited, primarily due to limitations in the size and distribution of GPS sample points and the quality of the DEM-derived terrain parameters. A standard deviation filter can be applied to the error surface to create an accuracy surface. The filtering process absorbs the more spurious error estimates, creating an accuracy surface that gives a more complete description of a DEM's accuracy. Chapter 5 investigates the use of such a spatially variable, heteroscedastic representation of DEM accuracy in modelling uncertainty in DEM-based analyses.



Chapter 5: Applying Knowledge of DEM Accuracy to Uncertainty Modelling The research presented in chapters 3 and 4 provides methods for giving a thorough assessment of DEM quality and for generating a detailed model of the spatially variable accuracy of a DEM. These techniques provide summary statistics and an accuracy surface, which allow one to compare the quality of different DEMs, and also to make an assessment of the uncertainty associated with the outcomes of DEM-based modelling projects. Assessing uncertainty is important because this indicates the validity of model outcomes and the degree of caution required when making decisions based on these outcomes. Assessing uncertainty also helps mitigate the potential for inappropriate use of results. The more that is known about the uncertainty of model outcomes, the better one can assess the validity or appropriate use of DEM-based modelling outcomes.

Summary statistics and accuracy surfaces provide limited and indirect information about uncertainty of model outcomes, which must be interpreted. Modelling the impact of DEM quality provides more direct information about uncertainty. Three main techniques have been used to model this uncertainty: epsilon bands; error propagation; and, stochastic simulation. A review of this previous work is presented in §5.2.1. The remainder of the chapter presents a modification of stochastic simulation techniques to utilise accuracy surfaces for modelling uncertainty. This application of a DEM's accuracy surface also allows further validation of the techniques for generating the accuracy surface presented in chapter 4.

5.1 Aim and Objectives The aim of the research presented in this chapter is to enhance the ability to assess uncertainty about the outcomes of DEM-based modelling applications.

The objectives are:
- To use a DEM's accuracy surface in stochastic simulation of uncertainty in modelling outcomes;
- To investigate the differences between stochastic simulations using an accuracy surface and those using a global accuracy measure;
- To assess the benefits of using an accuracy surface in stochastic simulation.

5.2 Introduction to Modelling Uncertainty As described in §2.2, DEMs contain error and are of limited quality; uncertainty therefore exists regarding what constitutes appropriate use of a DEM and the validity of DEM-based modelling outcomes. In recent years there has been increasing concern within the GIS research community about management of the quality of modelling outputs and dealing effectively with uncertainty. Hunter (1998) ascribes this increasing concern to four factors: the requirement in some application fields for data quality reports when transferring data; the need to protect the reputations of individuals or organisations when, for example, spatial data support an administrative decision which then goes to appeal; the need to protect against litigation if the user of spatial data suffers harm or loss; and the basic scientific requirement to describe the quality of information. Despite the increasing concern within the research community, end users' awareness of error, quality and uncertainty issues remains low (Heywood et al., 1998). There are few techniques incorporated into GIS software packages for identifying and modelling errors, or for managing the associated uncertainty (Eastman, 1999b; Heywood et al., 1998; Miller & Morrice, 1996).

A number of authors have proposed strategies for managing uncertainty. Hunter (1998) gives five action points in his uncertainty management strategy: “develop formal, rigorous models of uncertainty; understand how uncertainty propagates through spatial processing and decision making; communicate uncertainty to different levels of users in more meaningful ways; design techniques to assess the fitness for use of geographic information and reduce uncertainty to manageable levels for any given application; and, learn how to make decisions when uncertainty is present in geographic information, i.e. be able to absorb uncertainty and cope with it in our everyday lives.”

As a result of their work on modelling the spatial distribution of DEM error using geostatistics and stochastic simulation (§4.2.2.2 and §5.2.1.4), Kyriakidis et al. (1999) propose that a DEM is supplied with a map of predicted elevation errors, a map of the probability that elevation is over- or underestimated and a record of the parameters used to create simulated DEM realisations. This would allow the end user to make a decision on whether additional elevation data are needed in locations with the most extreme errors and also to undertake stochastic simulation to estimate the uncertainty of modelling outcomes. Their technique for generating the DEM error map has limitations (see §4.2.2.2 and §4.6.1). However, either this approach or an accuracy surface created using the techniques described in chapter 4 would be a useful complement to the comprehensive quality report recommended in §3.5.1.4. The proposed map of the probability that elevation is over- or underestimated is of specific use to elevation thresholding. Kyriakidis et al. (1999) give a case study in which the probability of an entire ridge lying above 160 m is calculated. This type of map can only be produced from an error map. An accuracy map cannot distinguish between over- and underestimation. Also, awareness, understanding and management of uncertainty are best advanced by providing generic information rather than information that is application specific. Kyriakidis et al. (1999) contribute to addressing the first three of Hunter's (1998) action points. They develop a model of uncertainty, apply this model to examine the uncertainty associated with one modelling scenario, and propose ways to better communicate uncertainty to others.

5.2.1 Techniques for Assessing and Managing Uncertainty The stochastic simulation approach employed by Kyriakidis et al. (1999) is one of a number of methods of potential use for assessing and managing the uncertainty associated with spatial modelling outcomes. These methods (epsilon bands, error propagation, fuzzy sets and fuzzy logic, and Monte Carlo simulation) are described below with particular attention to their use for managing uncertainty in DEM-based modelling.



5.2.1.1 Epsilon Bands Epsilon bands are polygons drawn around lines to represent the uncertainty in a line's position. This technique was first used by Chrisman (1982) and has been adopted by others such as Blakemore (1984). The technique was developed to communicate the uncertainty of a line's horizontal position due to the effect of digitising errors. The distance of the polygon perimeter from the digitised line is equal to the accuracy of the line's position. Concentric bands can be used to represent areas within 1, 2 and 3 standard deviations of the line's position (Fig. 5.1).

[Fig. 5.1: legend entries are Digitised line; < 1 standard deviation; 1-2 standard deviations; 2-3 standard deviations.]

Fig. 5.1 Epsilon bands

Epsilon bands have two limitations. First, the technique is only a visual way of communicating the accuracy of the digitising process. The technique gives no more information than stating the accuracy figure would. It is up to the end user to interpret how this accuracy impacts the validity of any derived results. Second, the technique is not directly applicable to DEM uncertainty. As described in §2.3.1, DEM error has horizontal and vertical components, which are indistinguishable, the result being cells with incorrect elevation values. Epsilon bands could be drawn around contour lines derived from the DEM. However, the width of these epsilon bands would be related to both standard deviation of the DEM error (either a global or locally variable accuracy measure) and locally variable gradient. Locally adjusting the width of an epsilon band to account for local variation in gradient and possibly local variation in accuracy would be computationally difficult. Examples of creating such locally variable epsilon bands have not been found in the literature and this technique is not attempted here.
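The dependence of contour band width on both elevation accuracy and gradient can be illustrated with a small calculation. The following is only a sketch under a stated assumption that is not part of the thesis: the slope is treated as locally planar, so a vertical error e displaces a contour horizontally by e / tan(gradient). The function name `epsilon_band_half_width` and the example values are hypothetical.

```python
import math


def epsilon_band_half_width(sd_elevation_m, gradient_deg, k=1.0):
    """Horizontal half-width (m) of a k-standard-deviation band around a contour.

    Assumption (not from the thesis): the slope is locally planar, so a
    vertical error e shifts the contour horizontally by e / tan(gradient).
    """
    if gradient_deg <= 0.0:
        # On flat ground the contour position is undefined.
        return math.inf
    return k * sd_elevation_m / math.tan(math.radians(gradient_deg))


# Hypothetical elevation standard deviation of 5 m on three gradients.
for gradient in (2.0, 10.0, 45.0):
    widths = [epsilon_band_half_width(5.0, gradient, k) for k in (1, 2, 3)]
    print(gradient, [round(w, 1) for w in widths])
```

Under this assumption the band widens sharply on gentle slopes, which is one way to see why a single band width cannot be applied along a whole contour.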



5.2.1.2 Error Propagation When spatial data sets are analysed or modelled, the error in each input data set propagates through the analysis and combines with error in other input data sets often giving rise to output data sets with exacerbated errors (Heuvelink, 1998). Research has been undertaken to define formulae that calculate the propagated error (Eastman, 1999b; Heuvelink, 1998; Veregin, 1994). Initially these error propagation formulae were restricted to analyses involving GIS operations that are local to individual cells (Ehlschlaeger & Shortridge, 1997). Heuvelink (1998) presents methods for estimating propagated error in analyses involving spatial interaction, i.e. GIS operations applied to a neighbourhood. There are three main problems with calculating propagated error for such non-local operations (Rossiter, 1995): the formulae are impossible and/or impractical to define, or too complex to apply easily; there are problems with accounting for potential covariance of errors when two or more data sets are correlated; error propagation assumes that neighbouring cells are spatially independent, which is often not applicable. These three problems are particularly applicable to DEM-based modelling and analysis. Therefore error propagation has not been used for modelling uncertainty associated with many DEM-based applications.

5.2.1.3 Fuzzy Sets and Fuzzy Logic Fuzzy sets and fuzzy logic have been used to model uncertainty associated with nominal or ordinal data (Eastman, 1999b; Rossiter, 1995). These techniques could be of use in a DEM-based application if it involves classification of DEM-derived data into categories, such as steep, medium and gentle gradients. They can also be used for DEM-derived feature classifications such as the viewshed uncertainty work by Fisher (1991, 1992, 1993, 1994). However, these techniques are not directly applicable to DEM-based modelling and are not considered in the research presented here.



5.2.1.4 Monte Carlo Simulation Monte Carlo simulation is the most commonly used of a number of stochastic simulation techniques. Stochastic simulation is a generalised and flexible technique for modelling uncertainty in the outputs of any spatial analysis (Openshaw et al., 1991). The technique is the only way to model uncertainty in complex non-local operations (Ehlschlaeger & Shortridge, 1997). Fisher (1991, 1992 and 1993) uses Monte Carlo simulation to assess uncertainty in results of viewshed analyses. Ehlschlaeger and Shortridge (1997) apply the technique to model uncertainty in the least cost path operation. Ehlschlaeger and Goodchild (1994) use Monte Carlo simulation to model uncertainty about the impact of sea-level rise.

In the context of DEM-based modelling, Monte Carlo simulation is based on the principle that the DEM is one of an infinite number of possible representations of the actual elevation surface (Ehlschlaeger & Shortridge, 1997). Each of these possible representations, or realisations, is equally likely to occur and the DEM is considered as a randomly chosen realisation. The full set of equally probable realisations can be defined by a probability density function (PDF) with a known mean surface and a random field function representing the potential deviations of realisations from this mean surface. The DEM is assumed to represent the mean of the PDF. The deviation of other realisations from the mean surface is assumed to have a normal distribution with a zero mean and a standard deviation equal to the standard deviation of elevation error. For each DEM cell, an alternative realisation is generated by randomly drawing a value from the normal distribution and adding the value to the DEM cell's elevation. These randomly chosen cell values are then processed to give a DEM realisation with spatially correlated rather than random values. This processing can be a mean filter (Burrough & McDonnell, 1998; Kyriakidis et al., 1999) or an iterative swapping of cell values until a target degree of spatial correlation is achieved (Fisher, 1991; Ehlschlaeger & Goodchild, 1994). Repeating this procedure for all cells a user-specified number of times (N) creates N surface realisations. Spatial modelling is then applied to each of these realisations to create N output data sets. The mean of these output data sets represents the most probable result. The variation in the output data sets indicates the degree of uncertainty in the results. This variation can be summarised by a grid representing the standard deviation of values at each cell for the N output data sets. Fig. 5.2 gives schematic representations of a spatial modelling procedure and the equivalent Monte Carlo simulation.
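The procedure described above can be sketched in a few lines of Python. This is an illustrative sketch, not the ArcView implementation: it assumes a global standard deviation, a 3 x 3 mean filter applied to the random field before it is added to the DEM, and a toy derivative (`steepest_drop`) standing in for the spatial modelling step. All function names, the cell size and the example DEM are hypothetical.

```python
import random
import statistics


def random_field(rows, cols, sd, rng):
    """One uncorrelated N(0, sd) value per cell."""
    return [[rng.gauss(0.0, sd) for _ in range(cols)] for _ in range(rows)]


def mean_filter(grid):
    """3 x 3 mean filter; edge cells average the neighbours that exist."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [grid[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def steepest_drop(dem, cell_size):
    """Toy derivative: largest downhill gradient to a 4-neighbour (0 if none)."""
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < rows and 0 <= cc < cols:
                    drop = (dem[r][c] - dem[rr][cc]) / cell_size
                    out[r][c] = max(out[r][c], drop)
    return out


def monte_carlo(dem, sd, n, cell_size=10.0, seed=0):
    """Return (mean grid, standard deviation grid) of the derivative over n runs."""
    rng = random.Random(seed)
    rows, cols = len(dem), len(dem[0])
    outputs = []
    for _ in range(n):
        # Perturb, spatially correlate, then apply the spatial model.
        noise = mean_filter(random_field(rows, cols, sd, rng))
        realisation = [[dem[r][c] + noise[r][c] for c in range(cols)]
                       for r in range(rows)]
        outputs.append(steepest_drop(realisation, cell_size))
    mean_grid = [[statistics.fmean(o[r][c] for o in outputs)
                  for c in range(cols)] for r in range(rows)]
    sd_grid = [[statistics.pstdev([o[r][c] for o in outputs])
                for c in range(cols)] for r in range(rows)]
    return mean_grid, sd_grid


# Hypothetical 5 x 5 DEM sloping 2 m per cell towards the east.
dem = [[100.0 - 2.0 * c for c in range(5)] for _ in range(5)]
mean_grid, sd_grid = monte_carlo(dem, sd=1.0, n=50)
print(round(mean_grid[2][2], 3), round(sd_grid[2][2], 3))
```

The standard deviation grid is the per-cell summary of uncertainty mentioned in the text; a real application would replace `steepest_drop` with the modelling operation of interest.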

[Fig. 5.2: flow diagrams. Ordinary Spatial Modelling Procedure: Input Data Sets; Spatial Modelling; Output Data Set. Monte Carlo Simulation Procedure: Input Data Sets; Perturb Data Sets; Spatially Correlate; Alternative Data Set Realisation; Spatial Modelling; Store Output Data Set; Repeat N Times; Calculate Summary Statistics for N Output Data Sets.]

Fig. 5.2 The Monte Carlo simulation procedure (adapted from Openshaw et al., 1991)

Although the stochastic simulation approach to modelling uncertainty is flexible and hence widely applicable, there are three important issues. First, a large number, maybe hundreds or thousands, of simulations must be run in order to give a good estimate of uncertainty. The processing load and disk storage requirements are considerable (Heuvelink, 1998). The consequence can be that the stochastic simulation approach is abandoned, because it is impractical, or a lower number of simulations are run giving a poor estimate of the degree of uncertainty in the results.

The second issue associated with stochastic simulation is that the number of realisations, and hence the number of simulations and output data sets, affects the degree of variation in the output data sets (Heuvelink, 1998; Openshaw et al., 1991). Too few simulations will give lower variation leading to an underestimate of the degree of uncertainty. Heuvelink (1998) reports that in groundwater modelling 100 simulations are believed to give a good estimate of the mean result, 1000 simulations are required to estimate the results' variance and tens of thousands of simulations are required to estimate the 1% quantile. Nackaerts et al. (1999) present two methods for estimating the required number of simulations. The first method is only applicable to spatial modelling which produces Boolean results, such as viewshed modelling or watershed delineation, where stochastic simulation leads to an estimated probability that a cell lies within the viewshed/watershed. The simulations are considered as Bernoulli experiments and confidence intervals for the accuracy of the estimated probability are calculated. In the second method the variation in results is plotted against the number of simulations after each simulation. The appropriate number of simulations has been attained when the graph levels off, i.e. when variation in the results stabilises. The problem with this approach is that the required number of simulations is not known a priori. Therefore the required processing time and disk storage space cannot be estimated. Further research into estimating the number of simulations is required.

A third issue relates to the PDF's random field function. Previous examples of applying Monte Carlo simulation have used a global DEM accuracy measure to represent the normal distribution from which realisations are created. Researchers have tended to use the RMSE as this is the accuracy measure commonly supplied with digital elevation data or DEMs from national mapping agencies such as the Ordnance Survey or USGS. However, the standard deviation of elevation error is more appropriate for a variable that does not necessarily have a mean of zero (§3.2.3.3). More importantly, as discussed in §4.1 and §4.2, a global accuracy measure gives no information about the spatial variation in accuracy across the DEM. Using an accuracy surface, such as those developed in Chapter 4, to generate alternative DEM realisations should give a more informed model of uncertainty. This has not been undertaken previously due to the absence of techniques for creating such spatially variable accuracy surfaces.
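The Bernoulli view of Boolean simulation results can be made concrete with a short sketch. The exact interval used by Nackaerts et al. (1999) is not given here, so the example below assumes the common normal-approximation (Wald) interval for a proportion; the function names and the 95% value z = 1.96 are illustrative choices, not taken from the cited work.

```python
import math


def probability_interval(hits, n, z=1.96):
    """Estimated probability and its approximate confidence interval.

    hits: number of realisations in which the cell is inside the
          viewshed/watershed; n: total number of realisations.
    Assumption: normal-approximation (Wald) interval, clipped to [0, 1].
    """
    p = hits / n
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def simulations_needed(half_width, p=0.5, z=1.96):
    """Realisations needed for a target interval half-width (worst case p = 0.5)."""
    return math.ceil(z * z * p * (1.0 - p) / (half_width * half_width))


# A cell inside the watershed in 50 of 100 hypothetical realisations.
print(probability_interval(50, 100))
# Realisations needed to estimate any probability to within +/- 0.05.
print(simulations_needed(0.05))
```

Under this approximation the interval narrows in proportion to 1/sqrt(N), so halving the interval width requires roughly four times as many realisations.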

5.3 Research Rationale Previous research has used a global measure of a DEM's accuracy as the basis for modelling uncertainty in the outputs of DEM-based modelling and analysis. This is due to the lack of a methodology for creating an accuracy surface, which represents local variation in DEM accuracy. The research presented in Chapter 4 gives a technique for producing such an accuracy surface. There is now the opportunity:
1. To develop tools and algorithms which use an accuracy surface to model uncertainty in DEM-based modelling outputs; and,
2. To investigate how uncertainty models based on locally variable accuracy information differ from those based on global accuracy measures.
The development of new Monte Carlo simulation tools that use an accuracy surface will further the ability of DEM users to account for uncertainty in their results when making decisions. The investigation of the difference between uncertainty models using global and local accuracy will advance our understanding of how limited DEM quality affects uncertainty in the outputs of DEM-based modelling. Thus the research presented here will contribute to three of the five action points in Hunter's (1998) uncertainty management strategy described in §5.2, namely to develop formal, rigorous models of uncertainty; to understand how uncertainty propagates through spatial processing and decision making; and, to design techniques to assess the fitness for use of geographic information and reduce uncertainty to manageable levels for any given application.

5.4 Methodology The methods employed fall into three sections:
- Development of tools for Monte Carlo simulation of uncertainty in DEM derivatives using an accuracy surface;
- Application of the Monte Carlo simulation tool to a Mestersvig DEM and accuracy surface;
- Comparison of Monte Carlo simulation results based on an accuracy surface and equivalent results based on a global accuracy measure.

An ArcView extension developed by Suzanne Wechsler, a PhD student at State University of New York, was downloaded from the Internet (see footnote 1). This extension comprised numerous scripts including a set for Monte Carlo simulation of uncertainty in DEM derivatives using a global accuracy measure. These scripts were used as the basis for extending the DEM Uncertainty and Quality extension (Appendix 2) to include the following functionality:
- Determine an appropriate number of Monte Carlo simulations
- Create random fields
- Perform Monte Carlo simulation

Footnote 1: The web page (http://web.syr.edu/~srperlit/), last accessed in October 1998, is no longer available, no alternative URL has been found, and the State University of New York web pages have no information about Suzanne Wechsler. The author will be glad to properly reference Suzanne Wechsler's ArcView extension and PhD thesis if information is acquired.

5.4.1 Development of Monte Carlo Simulation Tools The ArcView tools that have been developed give three stages to the process of Monte Carlo simulation: 1. Determine the appropriate number of simulations; 2. Create random fields; 3. Perform the simulation.

5.4.1.1 Determine the Number of Simulations As described in §5.2.1.4, the number of simulations influences the quality of the uncertainty model. Insufficient simulations will underestimate the variability of the model outcomes, causing uncertainty to be under-estimated. However, Monte Carlo simulation is a time consuming and resource intensive procedure. Therefore, it is desirable to keep the number of simulations to a minimum.

A simple approach has been adopted to determine the appropriate number of simulations required. It is based on the assumption that when variability between the N DEM realisations levels off, variability between the N outcomes also levels off and therefore a sufficiently accurate estimate of uncertainty is achieved. For each cell location the standard deviation of values within the N realisations can be calculated. The overall standard deviation of values within this standard deviation grid provides an estimate of the variability. This variability can be calculated for a series of N values. Variability can be plotted against N on a graph. As the number of realisations increases the variability will at first decrease and then level off. The point at which the graph levels off represents the suitable number of realisations (Fig. 5.3).



[Fig. 5.3: line graph. x-axis: Number of Realisations (N); y-axis: Standard Deviation of Standard Deviation Grid; the point where the curve levels off is labelled "Appropriate number of realisations".]

Fig. 5.3 Suitable number of realisations determined from a graph of N plotted against variability.

The “Determine number of simulations (one value)” and “Determine number of simulations (range of values)” menu items provide two methods for employing this approach. The “one-value” method asks the user to specify a DEM, an accuracy surface and a number of realisations (N). A perturbation grid is created by randomly selecting a value for each cell from a Normal distribution with zero mean and standard deviation equal to the value stored in the corresponding accuracy surface cell. The perturbation grid is added to the DEM to create an alternative equally probable DEM realisation. This is repeated N times to create N DEM realisations. The standard deviation of the standard deviation of cell values for the N DEM realisations is then calculated and reported to the user. Repeating this method for a series of values of N, for example N = 10, 20, 30…150, gives the data to create the graph shown in Fig. 5.3 and determine the appropriate number of realisations. With the “range of values” method the user specifies a maximum number of realisations and the interval at which the standard deviation of the standard deviation grid should be calculated. For example, specifying 150 for the maximum number of simulations and 10 for the interval would give calculations of the standard deviation for N = 10, 20, 30 and so on to N = 150. Thus, the whole process of determining values for the graph can be run continuously from start to finish without user intervention. However, the process can take a long time. Running the process in stages may be preferable, in which case the “one value” method can be used.
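The "one value" and "range of values" calculations can be sketched as follows. This is a minimal Python illustration rather than the ArcView scripts themselves: `sd_of_sd_grid` plays the role of the one-value method and `variability_curve` the range-of-values method. The function names, the use of the population standard deviation and the small uniform accuracy surface are assumptions made for the example.

```python
import random
import statistics


def perturbation_grid(accuracy, rng):
    """Draw one value per cell from N(0, sd), with sd read from the accuracy surface."""
    return [[rng.gauss(0.0, sd) for sd in row] for row in accuracy]


def sd_of_sd_grid(dem, accuracy, n, rng):
    """'One value' method: SD of the per-cell SDs of n DEM realisations."""
    rows, cols = len(dem), len(dem[0])
    realisations = []
    for _ in range(n):
        noise = perturbation_grid(accuracy, rng)
        realisations.append(
            [[dem[r][c] + noise[r][c] for c in range(cols)] for r in range(rows)]
        )
    cell_sds = [
        statistics.pstdev([real[r][c] for real in realisations])
        for r in range(rows)
        for c in range(cols)
    ]
    return statistics.pstdev(cell_sds)


def variability_curve(dem, accuracy, max_n, interval, seed=0):
    """'Range of values' method: one (N, variability) pair per interval step."""
    rng = random.Random(seed)
    return [
        (n, sd_of_sd_grid(dem, accuracy, n, rng))
        for n in range(interval, max_n + 1, interval)
    ]


# Hypothetical 10 x 10 DEM with a uniform 2 m accuracy surface.
dem = [[100.0 + r + c for c in range(10)] for r in range(10)]
accuracy = [[2.0] * 10 for _ in range(10)]
for n, variability in variability_curve(dem, accuracy, max_n=100, interval=10):
    print(n, round(variability, 3))
```

With a uniform accuracy surface the printed variability falls towards zero as N grows. With a spatially variable accuracy surface it would be expected to level off near the standard deviation of the accuracy surface values themselves, which is the plateau read from the graph.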



5.4.1.2 Creating Random Fields The “Make Random Grids” menu item creates random fields that can then be input to the Monte Carlo simulation (§5.4.1.3) and used to perturb the DEM to create alternative DEM realisations. The user specifies the accuracy surface to use and the number of random fields to generate. The random fields are then generated using the same procedure as described in §5.4.1.1, saved to disk and added to the View. Creating the random fields independently of the Monte Carlo simulation functionality allows the same random fields to be used for more than one run of the simulation procedure.

5.4.1.3 Monte Carlo Simulation The “Monte Carlo simulation” menu item invokes a suite of ArcView scripts which simulate uncertainty in DEM derivatives using a Monte Carlo approach. The user specifies the DEM, the accuracy surface, the number of simulations (N) and the N random grids. To model the spatially correlated, rather than random, nature of DEM error a mean filter is applied to the random grids. The user specifies the radius of the filter window to apply, which should relate to the degree of spatial correlation in the DEM's error. The user also specifies which terrain parameters should be simulated. Any number of parameters from the following list can be chosen:
- elevation
- gradient
- upstream area
- topographic index
- watershed

The topographic index, also known as the wetness index, is calculated using Equation 2.1. This index provides a measure of the degree of water retention and is commonly used in hydrological models (Burrough & McDonnell, 1998).

If the watershed option is chosen, the user must also specify a source cell image defining the outlet of the watershed to be delineated. Thresholds can be specified for elevation, gradient, upstream area and topographic index so that the probability that the threshold is exceeded can be calculated – see below for further details. If the user only chooses some of the derivatives, they are given the option of having the DEM realisations saved for later use.
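The mean filtering of a random grid with a user-specified radius can be sketched as below. This is an illustration only: it assumes a circular window measured in cells and lets edge cells average whatever neighbours exist, neither of which is confirmed by the text. The names `circular_offsets`, `mean_filter` and `flatten` are hypothetical.

```python
import random
import statistics


def circular_offsets(radius):
    """Row/column offsets of all cells within `radius` cells of the centre."""
    return [(dr, dc)
            for dr in range(-radius, radius + 1)
            for dc in range(-radius, radius + 1)
            if dr * dr + dc * dc <= radius * radius]


def mean_filter(grid, radius):
    """Replace each cell by the mean of the cells inside a circular window."""
    rows, cols = len(grid), len(grid[0])
    offsets = circular_offsets(radius)
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [grid[r + dr][c + dc] for dr, dc in offsets
                      if 0 <= r + dr < rows and 0 <= c + dc < cols]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def flatten(grid):
    """All cell values of a grid as one list."""
    return [value for row in grid for value in row]


# Hypothetical uncorrelated random grid with unit standard deviation.
rng = random.Random(42)
raw = [[rng.gauss(0.0, 1.0) for _ in range(30)] for _ in range(30)]
smooth = mean_filter(raw, radius=2)
print(round(statistics.pstdev(flatten(raw)), 3),
      round(statistics.pstdev(flatten(smooth)), 3))
```

The second printed value is much smaller than the first: averaging introduces spatial correlation but also shrinks the spread of the perturbations. Whether the filtered grids are rescaled to restore the intended standard deviation is not stated in the text, so it is worth checking when interpreting simulated uncertainty.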



The Monte Carlo procedure begins once all user specifications have been entered. First, the random grids are filtered and then added to the DEM to create N DEM realisations. Second, for each type of parameter N realisations are calculated. Third, a number of grids statistically summarising the variation between the N parameter realisations are computed. These summary statistics grids provide the information about the uncertainty of model outputs. The summary statistic grids calculated for each derivative are given in Table 5.1.

Table 5.1 Monte Carlo Simulation Summary Statistics Grids

Statistic   Elevation   Gradient   Upstream Area   Topo. index   Watershed
Bias        √           √          √               √
R-bias      √           √          √               √
ARAD        √           √          √               √
RMSE        √           √          √               √
R-RMSE      √           √          √               √
L-RMSE      √           √          √               √
Avg         √           √          √               √
R-avg       √           √          √               √
SD          √           √          √               √
R-SD        √           √          √               √
Max         √           √          √               √
Min         √           √          √               √
t-test      √           √          √               √
Prob        √           √          √               √             √

Bias measures the average deviation of the derivative values and is expressed as:

Equation 5.1: Bias

$$\text{Bias} = \frac{\sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)}{N}$$

where $Y_i$ is the derivative value obtained from DEM realisation i, $\hat{Y}_i$ is the derivative value obtained from the unperturbed DEM, and N is the number of simulations. Each cell in the bias grid represents the average amount of deviation expected for that derivative at that particular location.

R-bias is the relative bias and expresses the average percent deviation of the perturbed derivative from the unperturbed value and is expressed as:



Equation 5.2: Relative Bias

$$\text{R-Bias} = \frac{\sum_{i=1}^{N} \dfrac{Y_i - \hat{Y}_i}{\hat{Y}_i}}{N}$$

where $Y_i$, $\hat{Y}_i$ and N are as in Equation 5.1. Each value in the relative bias grid can be interpreted as the percentage by which that location is expected to deviate.

The average relative absolute difference (ARAD) between a perturbed derivative and the unperturbed value measures the average percent absolute deviation of the perturbed parameter (Desmet, 1997; Li, 1988). This statistic is similar to the relative bias; however, all values are reported as positive. ARAD is expressed as:

Equation 5.3: Average Relative Absolute Difference

$$\text{ARAD} = \frac{\sum_{i=1}^{N} \dfrac{\left| Y_i - \hat{Y}_i \right|}{\hat{Y}_i}}{N}$$

where $Y_i$, $\hat{Y}_i$ and N are as in Equation 5.1.

In this situation, Root Mean Square Error (RMSE) is one measure of the variation of derivative values and estimates the accuracy of each cell's derivative value. Use of RMSE is appropriate because a DEM's perturbation values are chosen from a PDF with a mean of zero and a normal distribution. RMSE is expressed as:

Equation 5.4: Root Mean Square Error

$$\text{RMSE} = \sqrt{\frac{\sum_{i=1}^{N} \left( Y_i - \hat{Y}_i \right)^2}{N}}$$

where $Y_i$, $\hat{Y}_i$ and N are as in Equation 5.1.

The relative root mean square error (R-RMSE) standardises the RMSE to the unperturbed derivative value for each cell location (Kroll and Stedinger, 1996). The resulting R-RMSE value is a percentage and represents the standard variation of the derivative. R-RMSE is expressed as:


Equation 5.5: Relative Root Mean Square Error N

R

i 1

RMSE

Yî

Yi

2

Yî N

where Yi , Yî and N are as in Equation 5.1. The Log Root Mean Square Error (L-RMSE) statistic assigns more weight when the unperturbed derivative value is higher than the perturbed derivative value (Kroll and Stedinger, 1996). The perturbed value is standardized to the unperturbed value and the natural logarithm is computed. L-RMSE is expressed as: Equation 5.6: Log Root Mean Square Error 2 N i 1

L RMSE

Y ln i Yˆ i

N

where Yi, Ŷi and N are as in Equation 5.1.

The average (Avg) and standard deviation (SD) of the N simulation grids are surfaces where each cell represents either the average or standard deviation of all the perturbed values at that location. The average of N residual grids (R-avg) and standard deviation of N residual grids (R-SD) are surfaces where each cell represents either the average or standard deviation of all the residuals between the perturbed and unperturbed derivative values at a particular location. These statistics are expressed in Equations 5.7 and 5.8.

Equation 5.7: Average of Residuals

$$\mathrm{R\text{-}avg} = \frac{\sum_{i=1}^{N}\left(Y_i - \hat{Y}_i\right)}{N}$$

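where Yi, Ŷi and N are as in Equation 5.1.

Equation 5.8: Standard Deviation of Residuals

$$\mathrm{R\text{-}SD} = \sqrt{\frac{\sum_{i=1}^{N}\left[\left(Y_i - \hat{Y}_i\right) - \overline{\left(Y_i - \hat{Y}_i\right)}\right]^2}{N}}$$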

where Yi, Ŷi and N are as in Equation 5.1.

Maximum of N residuals (Max) and minimum of N residuals (Min) grids are surfaces where each cell represents either the maximum or the minimum residual for a particular parameter.

A Student's t-test is used to determine whether perturbed derivative values are significantly different from the unperturbed derivative. Three t-test result grids are computed. One grid (t) reports the t-statistic computed for each cell. The other two grids reclass those values as either significantly ("1") or not significantly different ("0") from the unperturbed grid at the α = 0.05 (t05) and α = 0.01 (t01) significance levels. The significance grids can be used to determine areas in a DEM where the impact of uncertainty may prove significant. The t-statistic is expressed as (Devore, 1987; Davis, 1986):

Equation 5.9: Student's t Statistic

$$t = \frac{\bar{Y}_i - \hat{Y}_i}{s_i\sqrt{\dfrac{1}{N}}}$$

where Ȳi is the average of the N perturbed derivative values, Ŷi is the unperturbed derivative value, si is the standard deviation of the derivative at each grid cell location and N is the number of simulations.

Probability grids are calculated for all five of the derivatives. The watershed probability grid records the percentage of realisations in which a cell is identified as being within the watershed. The grid therefore represents the probability that a cell is in the watershed. This is the only output grid representing watershed uncertainty. Other output grids are inappropriate, because the watershed grid is Boolean – for each realisation a cell is either in the watershed or not. For elevation, gradient, upstream area and topographic index the probability grid represents the percentage of realisations in which the cell's value exceeds the user-specified threshold.

All of the above summary statistics grids are computed, saved to disk and displayed in the View.
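The per-cell statistics in Equations 5.3 to 5.9 can be sketched with NumPy as follows. This is an illustrative sketch only, not the code of the ArcView extension: the function name `summary_statistics`, the array layout (N realisations stacked along the first axis) and the use of the sample standard deviation for s_i are assumptions made here.

```python
import numpy as np


def summary_statistics(perturbed, unperturbed, threshold=None):
    """Per-cell uncertainty statistics from N perturbed derivative grids.

    perturbed   : array of shape (N, rows, cols), one layer per realisation
    unperturbed : array of shape (rows, cols), derivative of the original DEM
    threshold   : optional value used for the exceedance probability grid
    """
    y = np.asarray(perturbed, dtype=float)
    y_hat = np.asarray(unperturbed, dtype=float)
    n = y.shape[0]
    resid = y - y_hat  # residual of every realisation at every cell

    # Zero or negative unperturbed values give inf/NaN in the relative
    # and logarithmic statistics; the warnings are silenced here.
    with np.errstate(divide="ignore", invalid="ignore"):
        stats = {
            "ARAD": np.mean(np.abs(resid) / y_hat, axis=0),            # Eq. 5.3
            "RMSE": np.sqrt(np.mean(resid ** 2, axis=0)),              # Eq. 5.4
            "R-RMSE": np.sqrt(np.mean((resid / y_hat) ** 2, axis=0)),  # Eq. 5.5
            "L-RMSE": np.sqrt(np.mean(np.log(y / y_hat) ** 2, axis=0)),  # Eq. 5.6
            "Avg": y.mean(axis=0),
            "SD": y.std(axis=0),
            "R-avg": resid.mean(axis=0),                               # Eq. 5.7
            "R-SD": resid.std(axis=0),                                 # Eq. 5.8 (divides by N)
            "Max": resid.max(axis=0),
            "Min": resid.min(axis=0),
        }
        s = y.std(axis=0, ddof=1)  # sample SD: an assumption, not from the source
        stats["t"] = (y.mean(axis=0) - y_hat) / (s * np.sqrt(1.0 / n))  # Eq. 5.9

    if threshold is not None:
        # Percentage of realisations in which the cell exceeds the threshold
        stats["Prob"] = 100.0 * np.mean(y > threshold, axis=0)
    return stats
```

Cells where the unperturbed value is zero give undefined relative and logarithmic statistics, so a working implementation would need an explicit policy for such cells.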



5.4.2 Comparison of Simulations Based on Global and Local Accuracy

The set of Monte Carlo simulation tools described in §5.4.1 was applied to a DEM of the Mestersvig study area and its accuracy surface. The Mestersvig data were used because the study area comprises a complete river catchment. The IDW512 DEM was used because the highest quality accuracy surface had been produced from this DEM (see §4.5.3). This DEM was produced using inverse distance weighting with a weight of 5 and a 12 point search radius. The simulation was run twice: once using the accuracy surface and once with a grid in which all cells had a value equal to the accuracy surface's average value, 23.03m. Therefore one run used a spatially variable model of DEM accuracy and the other used a single global accuracy measure. The summary statistics grids produced from the two Monte Carlo simulation runs were compared. This was achieved by visual examination of the grids and by subtracting the summary statistic grid of one run from the corresponding grid of the other run. This comparison allowed assessment of the influence of the more detailed accuracy information provided by the accuracy surface on Monte Carlo simulation results.
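The two runs differ only in the standard deviation grid that drives the perturbation. A minimal NumPy sketch of that idea is given below. It assumes that each realisation is formed by adding zero-mean normal noise to the DEM and then smoothing the perturbed grid with a 3 × 3 mean filter; the window size, the edge handling and the function names are assumptions for illustration and are not taken from the ArcView tool.

```python
import numpy as np


def mean_filter_3x3(grid):
    """3 x 3 moving-window mean; edge cells reuse their nearest values."""
    grid = np.asarray(grid, dtype=float)
    padded = np.pad(grid, 1, mode="edge")
    rows, cols = grid.shape
    total = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0


def simulate_dems(dem, sigma, n_realisations=100, seed=0):
    """Return an (n_realisations, rows, cols) stack of perturbed DEMs.

    sigma may be a single number (global accuracy measure) or an array
    with the same shape as dem (accuracy surface).
    """
    rng = np.random.default_rng(seed)
    dem = np.asarray(dem, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), dem.shape)
    stack = np.empty((n_realisations,) + dem.shape)
    for k in range(n_realisations):
        # Zero-mean normal perturbation scaled by the local standard deviation
        noise = rng.standard_normal(dem.shape) * sigma
        stack[k] = mean_filter_3x3(dem + noise)
    return stack
```

Under these assumptions the global run would pass the single value 23.03 as `sigma`, while the local run would pass the accuracy surface itself.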

5.5 Results

The application of the Monte Carlo simulation tool to the Mestersvig IDW512 DEM produced two sets of 57 output grids describing uncertainty in five terrain characteristics (elevation, gradient, upstream area, topographic index and watershed). The first set of output grids was produced using a global accuracy measure. The second set was produced using an accuracy surface. For each of the five terrain characteristics, the differences between the global accuracy measure simulation and the accuracy surface simulation are described below and the most informative of the output grids are presented. The two simulations are described below as global simulation and local simulation.

5.5.1 Elevation

The RMSE of elevation (Fig. 5.4) and relative RMSE of elevation (Fig. 5.5) give a clear indication of the differences between the global and local Monte Carlo simulations of elevation uncertainty. The global simulation gives low RMSE values in the flattest parts of the Mestersvig study area and a random mixture of low to high RMSE values in all other areas (Fig. 5.4a). The global simulation uses the same probability density function for all cells to randomly choose a DEM perturbation value. In flat areas where neighbouring cells have highly similar elevation values the potential range of perturbed values will be lowest. The mean filter applied to the perturbed values further reduces this range. Therefore the RMSE is lowest in these locations. In sloping areas the potential range of perturbed values is greater. Also the actual range of perturbed values found within the N DEM realisations is less predictable. Therefore, the RMSE values will tend to be higher and more variable.

In contrast the RMSE values derived from the local simulation gradually increase from flat to steep terrain (Fig. 5.4b). In the steepest areas the highest RMSE values are more prevalent on southeast facing slopes. The accuracy surface used in the local simulation was derived from multivariate regression of DEM error to terrain parameters, in which slope-based parameters have a strong influence. Therefore, the accuracy surface has higher values for the standard deviation of elevation error on steeper and more southeast facing slopes. As noted in §4.5.3.3, these higher values can be ascribed to localised variation in shadow on these slopes at the time of the aerial photography. Consequently, the Monte Carlo simulation produces RMSE values whose distribution is strongly correlated to gradient and aspect.

The global simulation gives relative RMSE values that decrease gradually from low to high ground (Fig. 5.5a). This indicates that, in relative terms, uncertainty is highest in the low-lying coastal areas and lowest on the mountain peaks. Across the whole study area the RMSE values vary over a narrow range (Table 5.2). The RMSE values are near to constant and therefore RMSE divided by a low elevation value is always greater than RMSE divided by a high elevation value. Relative RMSE is primarily the inverse of elevation. The local simulation gives a very different distribution of relative RMSE with highest values on the steep and southeast facing slopes and lowest values on flat ground (Fig. 5.5b). This is because the RMSE values vary over a greater range than for the global simulation (Table 5.2). Therefore the relative RMSE values are not simply the inverse of elevation, but reflect the influence of other terrain characteristics on uncertainty.

Table 5.2 Summary statistics for frequency distribution of RMSE values for elevation uncertainty simulations.

                     Minimum   Maximum   Range    Mean   Standard deviation
Global simulation       2.95      8.67    5.72    5.81                 0.63
Local simulation        0.08     67.59   67.51    7.41                 6.77


Fig. 5.4 RMSE of Elevation: a) Based on global accuracy measure; b) Based on accuracy surface.

Fig. 5.5 Relative RMSE of Elevation: a) Based on global accuracy measure; b) Based on accuracy surface.

5.5.2 Gradient

The RMSE output grids for gradient uncertainty (Fig. 5.6) are broadly similar to those for elevation uncertainty (Fig. 5.4). However, there are some important differences.

The gradient RMSE grid from the global simulation is noisier than its elevation equivalent, and low RMSE values in the flattest areas are less prevalent. The derivative of a surface is noisier than the original surface (Wood, 1994) and Fig. 5.6a shows that this noise is passed on to the RMSE grid for that derivative. The gradient RMSE grid from the local simulation (Fig. 5.6b) has a very similar spatial distribution of values to that for elevation uncertainty (Fig. 5.4b). Lowest RMSE values are found in the flattest areas and highest RMSE values are found on steep and southeast facing slopes. As for the local simulation of elevation uncertainty, a spatially correlated rather than random uncertainty model has been produced.

In both the local and global simulations, the gradient RMSE tends to be higher than elevation RMSE, despite the lower potential range of original values (0° to 90° compared with 0m to 1180m). This can be seen by comparing the summary statistics in Table 5.3 with those in Table 5.2. This is because elevation RMSE is based on the accuracy of a single cell, while gradient RMSE is based on the accuracies of a cell and its eight neighbours.

Grids showing relative RMSE for gradient are shown in Fig. 5.7. For the global simulation (Fig. 5.7a), the relative RMSE is distinctly bimodal with low relative RMSE in sloping areas and high relative RMSE in flat areas. Relative RMSE is less distinctly bimodal for the local simulation (Fig. 5.7b). The presence of gradients less than 1° causes very high relative RMSE values in the flat areas for both simulations. The spatially variable model of DEM accuracy used in the local simulation mitigates, but does not remove, these high values. Neither of the relative RMSE grids provides a clear indication of the overall variability in gradient uncertainty. The RMSE grids are more suitable for this. However, the relative RMSE grids do illustrate that in flat areas a slight DEM error can have a relatively high impact on the derived gradient value. Such knowledge will be important in certain DEM-based modelling applications, such as flow modelling.

Table 5.3 Summary statistics for frequency distribution of RMSE values for gradient uncertainty simulations.

                     Minimum   Maximum   Range    Mean   Standard deviation
Global simulation       0.00     10.93   10.93    7.36                 1.70
Local simulation        0.00     93.56   93.56    9.12                 8.13


Fig. 5.6 RMSE of Gradient: a) Based on global accuracy measure; b) Based on accuracy surface.

Fig. 5.7 Relative RMSE of Gradient: a) Based on global accuracy measure; b) Based on accuracy surface.

5.5.3 Upstream Area

Grids representing the RMSE of upstream area are shown in Fig. 5.8. The highest RMSE values occur in the vicinity of locations with a large upstream area, namely around river courses. So the RMSE grids look similar to the upstream area grids from which they were derived. Cells that are on a river course have upstream area values that are orders of magnitude larger than their neighbours. The 100 DEM realisations route flow along slightly different courses. Consequently in one realisation a cell may be defined as being in the path of a river, while in others it is not. In such a situation the variation of upstream area values will be very high.



The local and global simulations produce RMSE grids that are similar for most of the Mestersvig area. However, there are significant differences in areas of braided channels, namely the outlets of the Tunnel Elv and Lille Blydal and upstream of the gorge on the Store Blydal (see Appendix 3 for location of place names). These differences are most pronounced around the Tunnel Elv's three distributaries. The global simulation produces a broad band of high RMSE values in these braided regions. This indicates that the DEM realisations produce a wide range of potential outlet routes. The local simulation also produces a band of high RMSE values around the northern Tunnel Elv distributary. However, the course of the other two Tunnel Elv distributaries is more clearly defined by a narrow line of cells with high RMSE values. This tendency for the local simulation to give narrower strips of high uncertainty is also found upstream of the Store Blydal gorge and at the mouth of the Lille Blydal. The global simulation gives a number of river courses which bypass the gorge completely. These differences are a consequence of the higher degree of DEM uncertainty for global simulation in flat areas.


Fig. 5.8 RMSE of Upstream Area: a) Based on global accuracy measure; b) Based on accuracy surface.

5.5.4 Topographic Index

Grids representing the RMSE of topographic index values are shown in Figure 5.9. The distribution of values for both the global simulation (Fig. 5.9a) and the local simulation (Fig. 5.9b) is similar to the distribution of upstream area RMSE (Fig. 5.8a & b). High RMSE values are found in the vicinity of stream channels, particularly in braided areas. The areas of high RMSE are more widespread in the global simulation than the local simulation. The topographic index is the natural logarithm of upstream area divided by gradient (Equation 2.1). Upstream area grids have a much wider range of values than the gradient grids. Consequently, the distribution of topographic index values is predominantly coincident with the distribution of upstream area values. Also, upstream area RMSE is correlated with upstream area. Therefore it is to be expected that the distribution of topographic index RMSE values is similar to the distribution of upstream area RMSE. In addition to the similarity between upstream area RMSE and topographic index RMSE, it can be seen that the mean and maximum RMSE values are higher for the global simulation than the local simulation. However, the difference grid (Fig. 5.9c) shows that RMSE for the global simulation is not consistently higher than RMSE for the local simulation. Global simulation RMSE is higher on flatter, wetter areas. Local simulation RMSE is higher on sloping, less saturated ground.
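The dominance of upstream area in the index can be illustrated with a small sketch. It assumes the common form ln(a / tan β), with a the upstream area and β the gradient in degrees. The exact form of Equation 2.1, the function name and the minimum-gradient floor used to avoid division by zero are all assumptions made here for illustration.

```python
import math


def topographic_index(upstream_area, gradient_deg, min_gradient_deg=0.1):
    """Topographic index ln(a / tan(beta)) for a single cell.

    Gradients below min_gradient_deg are raised to that floor so that
    flat cells do not cause a division by zero (an assumed convention).
    """
    beta = math.radians(max(gradient_deg, min_gradient_deg))
    return math.log(upstream_area / math.tan(beta))


# Effect of a 10,000-fold change in upstream area at a fixed 10 degree gradient
area_effect = topographic_index(1.0e6, 10.0) - topographic_index(1.0e2, 10.0)

# Effect of changing gradient from 45 degrees to 1 degree at a fixed area
gradient_effect = topographic_index(1.0e2, 1.0) - topographic_index(1.0e2, 45.0)
```

With these inputs the area effect is about 9.2 index units and the gradient effect about 4.0, which is consistent with the observation that the wider range of upstream area values controls the pattern of the index.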


Fig. 5.9 RMSE of topographic index: a) Based on global accuracy measure; b) Based on accuracy surface; c) Difference between the two surfaces (global minus local).



5.5.5 Watershed

Figure 5.10 shows the probability of lying within the Noret watershed for the global simulation (Fig. 5.10a) and the local simulation (Fig. 5.10b). The outline of the watershed boundary derived from the original IDW512 DEM is also shown. Both simulations show that the majority of the cells within the original IDW512 watershed have a high probability (greater than 0.9) of being within the watershed. However, both simulations produce areas of lower probability extending between 200m and 1.3km inside or outside the original watershed boundary. These areas of lower probability are most extensive where the watershed divide is located in areas of lower relief rather than sharp ridges. Also, the local simulation gives a number of patches of lower probability adjacent to Noret. These patches seem spurious and could result from artefacts of the pit removal algorithm influencing the quality of the accuracy surface at these locations. The difference grid (Fig. 5.10c) shows that the two simulations produce very similar watershed probability maps for the majority of the study area. However, there are large differences in probability for the areas of lower probability around the IDW512 watershed boundary. The local simulation probability values may be greater or less than those for the global simulation, but there is no clear pattern to these differences.


Fig. 5.10 Probability of being within the Noret watershed: a) Based on global accuracy measure; b) Based on accuracy surface; c) Difference between the two surfaces (global minus local). In all 3 maps the black line marks the watershed boundary derived from the original IDW512 DEM.



5.6 Issues Raised by the Comparison of Global and Local Simulation Results

§5.5 has given a description of the results obtained. Additionally, some of the issues raised by the Monte Carlo simulations have been addressed, particularly the causes of differences between the results of the global and local simulations. These causes are specific to each topographic variable and could consequently be more clearly explained in conjunction with the results. There are additional, more generic issues raised by the comparison of the two simulation approaches, which are examined below.

5.6.1 The Quality of the Simulations

By its nature, the quality of an uncertainty model is difficult to assess. Uncertainty modelling is used to represent our lack of knowledge about a DEM. Rigorous assessment of an uncertainty model would involve modelling our lack of knowledge about the extent of our lack of knowledge – clearly beyond the scope of scientific research. Consequently a descriptive assessment is made here of how reasonable the uncertainty models appear to be.

There are clear differences between the two models of DEM uncertainty as evidenced by the global simulation results and the local simulation results. But which simulation gives a better representation of DEM uncertainty? Both simulations give results for which the varying distribution of uncertainty as represented by the RMSE grids can be explained with reference to how the models are generated. Both uncertainty models can consequently be considered reasonable. However, as described in Chapter 4, DEM accuracy is spatially variable. Therefore, an accuracy surface gives a fuller representation of DEM quality than a global accuracy measure. Consequently, it is fair to assume that the local simulation better reflects what is known about the spatial distribution and extent of DEM accuracy and therefore provides a better quality uncertainty model.

In modelling uncertainty, the spatial variation of this uncertainty is a primary concern. The uncertainty in a DEM derivative at a location is clearly a function of accuracy, but also a function of terrain character at that location. To illustrate this, consider the slope gradient between two cells 10m apart. If both cells have an accuracy of ± 1m, then the potential range of gradient values in Monte Carlo realisations is greatest when the two cells have the same elevation and decreases as the height difference between the two cells increases (Fig. 5.11). Similar concepts apply to other terrain variables.

Scenario 1
Elevation at cell A = 10m ± 1m
Elevation at cell B = 12m ± 1m
Distance between cells = 10m
Minimum possible gradient (A = 11m, B = 11m) = atan(0 / 10) = 0°
Maximum possible gradient (A = 9m, B = 13m) = atan(4 / 10) = 21.8°
Range of gradient values = 21.8 – 0 = 21.8°

Scenario 2
Elevation at cell A = 10m ± 1m
Elevation at cell B = 14m ± 1m
Distance between cells = 10m
Minimum possible slope (A = 11m, B = 13m) = atan(2 / 10) = 11.3°
Maximum possible slope (A = 9m, B = 15m) = atan(6 / 10) = 31.0°
Range of gradient values = 31.0 – 11.3 = 19.7°

The graph below plots the minimum possible gradient, maximum possible gradient and range of gradient values for elevation differences of 0m to 30m.

[Graph: Minimum Gradient, Maximum Gradient and Gradient Range (degrees) plotted against Elevation Difference]

As elevation difference increases, the possible range of gradient values decreases. Lower gradients tend to be more uncertain than steep gradients.

Fig. 5.11 An illustration of the relationship between uncertainty and terrain character.

In the global simulation DEM accuracy is the same for all locations. Therefore the spatial variation in uncertainty is only a function of terrain character and DEM accuracy plays no role. The local simulation approach uses spatially variable DEM accuracy and consequently spatial variation in uncertainty is a function of both accuracy and terrain character. Local simulation clearly gives a more realistic representation of uncertainty. Indeed the global simulation approach seems to be of limited use.
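The two-cell illustration in Fig. 5.11 can be reproduced with a few lines of Python. The sketch assumes signed gradients, so that the minimum can be negative when the two cells are nearly level, which agrees with the scenario values; the function name and default arguments are illustrative only.

```python
import math


def gradient_range(elev_diff, accuracy=1.0, spacing=10.0):
    """Minimum, maximum and range of gradient (degrees) between two cells.

    elev_diff : nominal elevation difference between the cells (m)
    accuracy  : +/- elevation accuracy of each cell (m)
    spacing   : horizontal distance between the cells (m)
    """
    # Each cell can move by +/- accuracy, so the difference can shift
    # by up to twice the accuracy in either direction.
    lowest = math.degrees(math.atan((elev_diff - 2.0 * accuracy) / spacing))
    highest = math.degrees(math.atan((elev_diff + 2.0 * accuracy) / spacing))
    return lowest, highest, highest - lowest


# Gradient range for elevation differences of 0m to 30m
ranges = [gradient_range(float(d))[2] for d in range(0, 31)]
```

Calling `gradient_range(2.0)` and `gradient_range(4.0)` corresponds to Scenario 1 and Scenario 2 respectively, and the values in `ranges` fall steadily as the elevation difference grows.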

5.6.2 Usefulness of the Simulation Results

The Monte Carlo simulation tool developed for this research produces a large number of output grids as described in §5.4.1.3. However, only a small number of these grids have been presented in the results. For most topographic variables, the RMSE grids provide sufficient information for assessing the difference between the two simulations. This does not imply that other output grids, such as ARAD, bias etc., are not useful. These other output grids are not necessary when comparing two simulations, but could provide valuable further detail about the nature of uncertainty when considering the consequences of uncertainty in a particular application. For example, a small alteration in a cell's elevation can have a high impact in low relief areas. The direction of a stream's flow could be altered. In areas of higher relief the same small alteration in a cell's elevation is unlikely to alter a stream's flow direction. Relative RMSE gives a better representation of this local impact of elevation uncertainty, while RMSE gives a good representation of the actual extent of the uncertainty.

5.6.3 Applying Uncertainty Knowledge

The results of the Monte Carlo simulation based uncertainty modelling provide information describing the degree of confidence one can have in the variables derived from a DEM. This generic information can be further used in an application specific decision making context. The information on uncertainty allows the evaluation of best and worst-case scenarios, or the adoption of a risk-taking or risk-averse decision-making strategy. Two examples of applying uncertainty knowledge in this way are given below.

A government-funded study into the impacts of sea level rise could benefit from definition of best and worst case scenarios. For a given predicted amount of sea level rise the uncertainty model could be used to identify grid cells with a greater than 0.95 probability of being below this level (the best case scenario) and then the grid cells with a greater than 0.05 probability (the worst case scenario). Alternatively, the government could define the level of risk it is willing to take when identifying areas that could be inundated and set the probability threshold accordingly.

Delineation of watershed extent provides a second example of applying uncertainty knowledge. The extent of a watershed is an important factor in identifying suitable sites for a reservoir. The watershed extent determines the quantity of water available. A larger watershed provides a greater water supply, assuming that other influential factors such as rainfall and runoff are constant. It would be undesirable to overestimate watershed extent and consequently overestimate the water supply. Therefore a risk-averse or worst case scenario would entail identifying grid cells with a high probability of lying within the watershed. Alternatively, one may be identifying locations for a potentially polluting industrial facility. In this situation one would want to make sure that the facility was not located within the catchment of a water supply, such as a reservoir. The risk-averse scenario would entail identifying all grid cells with at least a low probability of lying within the reservoir's watershed. These two scenarios are illustrated in Fig. 5.12, showing cells with a probability greater than 0.95 of lying within the watershed and those with a probability greater than 0.05. The high probability watershed covers 137.9 km². The low probability watershed covers 156.8 km² and is therefore 18.9 km² or 14% larger. It is worth noting that even where the watershed boundary is composed of steep ridges, for example the most northern part of the boundary, there is a clear separation of the two watershed extents. The accuracy of the DEM causes significant uncertainty in the location of the watershed boundary even in areas of steep mountain terrain.

Fig. 5.12 Probability-based watersheds.
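Extracting two such extents from a stack of watershed realisations amounts to thresholding the per-cell probability. A sketch is given below, assuming a Boolean NumPy array with one layer per realisation; the function name and the `cell_size_m` parameter are hypothetical and the cell size of the Mestersvig DEM is not assumed here.

```python
import numpy as np


def watershed_extent(in_watershed, probability, cell_size_m):
    """Cells whose probability of lying in the watershed exceeds a threshold.

    in_watershed : Boolean array (N, rows, cols), True where a realisation
                   places the cell inside the watershed
    probability  : threshold between 0 and 1 (e.g. 0.95 or 0.05)
    cell_size_m  : grid cell size in metres (hypothetical parameter)
    Returns the Boolean mask and its area in square kilometres.
    """
    stack = np.asarray(in_watershed, dtype=float)
    prob = stack.mean(axis=0)  # proportion of realisations containing the cell
    mask = prob > probability
    area_km2 = mask.sum() * cell_size_m ** 2 / 1.0e6
    return mask, area_km2
```

A threshold of 0.95 gives the smaller, high probability extent suited to the water supply case, while 0.05 gives the larger extent suited to siting a potentially polluting facility.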



5.6.4 Other Approaches to Modelling Uncertainty

Being able to create a spatially variable accuracy surface opens up new approaches to incorporating DEM uncertainty in spatial modelling. Two possible new approaches are described below, but these approaches have not been implemented in the research presented. A DEM's accuracy surface could be used to refine elevation thresholding applications and to develop a new flow accumulation algorithm.

Eastman (1999) describes how a DEM with known RMSE can be used to evaluate the probability that a cell is above or below a specified elevation value. This analysis can be used in flood modelling or analysing impacts of sea level rise. For example, suppose sea level is predicted to rise by 2m and a DEM of a coastal area exists with a global RMSE of 1.5m. Eastman (1999) assumes that DEM error is normally distributed and RMSE is equivalent to the standard deviation of this normal distribution. A probability map can be generated in which each cell has a value equal to the probability that elevation at that location is less than 2m and would therefore become submerged by the increase in sea level. This probability map can then be reclassified to create a map of submerged risk based on the degree of risk one is willing to take. A risk averse strategy would reclassify a cell as inundated even if there was only a 1% chance of that cell being less than 2m. Because a global RMSE value is used and more than 99% of values in a normal distribution lie within 3 standard deviations of the mean, the risk averse inundation map would in fact simply follow the 6.5m contour (2m + (3 × 1.5m)). There is no need to create the probability map. One can simply decide on a confidence interval that reflects the degree of risk one wishes to take, add or subtract the corresponding number of standard deviations to the elevation threshold, then reclassify the DEM to find cells which are below this adjusted threshold.
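The probability calculation described above can be written out directly from the normal distribution assumption. The following is a sketch using only the Python standard library; the function name is illustrative, and treating RMSE as the standard deviation follows the assumption attributed to Eastman (1999).

```python
import math


def prob_below(elevation, threshold, sigma):
    """Probability that the true elevation is below the threshold.

    Assumes normally distributed DEM error with zero mean and standard
    deviation sigma (for a global measure, sigma is the RMSE).
    """
    z = (threshold - elevation) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Sea level rise of 2m, global RMSE of 1.5m
p_at_threshold = prob_below(2.0, 2.0, 1.5)   # cell exactly at the threshold
p_at_3_sd = prob_below(6.5, 2.0, 1.5)        # cell on the 6.5m contour
```

Under this model a cell on the 6.5m contour has a probability of about 0.13% of lying below 2m. A one-sided probability of 1% corresponds to roughly 2.33 standard deviations, or an elevation near 5.5m, so the three standard deviation contour is somewhat more cautious than a strict 1% rule.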

More sophisticated modelling of whether cells are above or below an elevation threshold is possible when a DEM's accuracy surface is available. It is necessary to generate a probability surface, because cells of equal elevation may not have the same standard deviation and therefore will have different probabilities of exceeding the threshold.

Flow accumulation modelling is a second potential application of a DEM's accuracy surface. Rather than modelling the uncertainty in the results of a DEM-based application, there is the potential to incorporate uncertainty about the input data into the processing algorithm.

Flow accumulation models the overland flow of water through a catchment. The technique is based on determining the direction of water flow from each cell to its neighbours. There are a number of techniques for determining flow direction that have been reviewed elsewhere (Burrough & McDonnell, 1998; Garbrecht & Martz, 1999; Rieger, 2000). The most common technique is known as the D8 algorithm (Martz & Garbrecht, 2000). Water is routed from a cell to its steepest downhill neighbour. This algorithm has been criticised for only allowing flow to travel in one of eight directions, tending to yield parallel flow paths on uniform slopes and only allowing convergent, not divergent, flow (Burrough & McDonnell, 1998; Fairfield & Leymarie, 1991; Tribe, 1992). Despite these significant drawbacks the algorithm is implemented in a number of GIS software packages, including ArcView and Idrisi (Burrough & McDonnell, 1998; Eastman, 1999a; ESRI, 1998). Multiple flow direction algorithms can model divergent flow by allowing flow from one cell to more than one neighbour (Rieger, 2000). This also mitigates the problem of parallel flow paths. The F8 algorithm is a type of multiple flow direction algorithm. Water flows from a cell to all its downhill neighbours with the proportion of flow to any one neighbouring cell being weighted by height of the downhill drop (Burrough & McDonnell, 1998). A DEM's accuracy surface allows a new type of flow accumulation algorithm, based on the F8 principle. The proportion of flow to each downhill neighbour, or indeed uphill neighbours too, could be weighted according to the probability that a neighbouring cell is lower than the cell in question.
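One way the proposed weighting might be expressed is sketched below. This is a speculative illustration of the idea, not an implemented or tested algorithm: it assumes that elevation errors at neighbouring cells are independent and normally distributed, and it ignores the longer distance to diagonal neighbours. All function names are hypothetical.

```python
import math


def prob_lower(z_centre, z_neighbour, sd_centre, sd_neighbour):
    """Probability that the neighbour is truly lower than the centre cell.

    Assumes independent normal errors, so the elevation difference has a
    standard deviation of sqrt(sd_centre^2 + sd_neighbour^2).
    """
    sd_diff = math.hypot(sd_centre, sd_neighbour)
    if sd_diff == 0.0:
        return 1.0 if z_neighbour < z_centre else 0.0
    z = (z_centre - z_neighbour) / sd_diff
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def flow_proportions(z_centre, sd_centre, neighbours):
    """Share of flow passed to each neighbour.

    neighbours : list of (elevation, standard deviation) pairs
    Returns proportions that sum to 1, or all zeros if no neighbour has
    any chance of being lower.
    """
    weights = [prob_lower(z_centre, z, sd_centre, sd) for z, sd in neighbours]
    total = sum(weights)
    if total == 0.0:
        return [0.0] * len(weights)
    return [w / total for w in weights]


# Centre cell at 10m; neighbours level, 2m lower and 2m higher, all with SD 1m
shares = flow_proportions(10.0, 1.0, [(10.0, 1.0), (8.0, 1.0), (12.0, 1.0)])
```

In this example the lower neighbour receives the largest share, the level neighbour a moderate share and the nominally uphill neighbour a small but non-zero share, which reflects the suggestion that uphill neighbours could also receive flow when the DEM is uncertain.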

5.7 Conclusions

The research presented in this chapter has used a Monte Carlo simulation approach to model the uncertainty associated with topographic variables derived from a DEM. An ArcView extension has been developed as a tool for implementing Monte Carlo simulation and producing a number of output grids that characterise the uncertainty associated with elevation, gradient, upstream area, topographic index and watershed delineation.



The Monte Carlo simulation tool has been applied using two alternative representations of a DEM's accuracy: a single standard deviation value, describing the global accuracy of the DEM (global simulation); and an accuracy surface describing the spatial variation in standard deviation of error across the DEM (local simulation). One hundred realisations of each topographic variable were produced. RMSE grids, representing the variation in these topographic variable realisations, show that there are major differences between the results of the global and local simulations of uncertainty. Although it is not possible to give an exact assessment of the validity of either uncertainty model, it can be assumed that the local simulation gives a more realistic representation of topographic variable uncertainty, because DEM accuracy is known to be spatially variable (see Chapter 4). The uncertainty associated with a DEM-derived variable is a function of DEM accuracy and terrain character. Global simulation only models the terrain character component of uncertainty. Consequently, it can be concluded that global simulation gives only a partial representation of uncertainty associated with DEM-based modelling. A local simulation approach, using a spatially distributed model of DEM accuracy, should be used.



Chapter 6: Advances in the Understanding and Handling of DEM Quality and Uncertainty

DEMs are widely used in GIS environmental modelling applications (Moore et al., 1991; Stocks & Heywood, 1994; Weibel & Heller, 1991). Terrain form has an important influence on environmental processes and on the distribution of environmental phenomena. DEMs can be used to derive a variety of terrain variables and indices, which in turn can be used in spatial modelling of this influence (Burrough & McDonnell, 1998). DEM derivatives are sensitive to errors in DEMs (Wood, 1994). Due to this sensitivity and the wide-ranging role of DEMs, the quality of DEMs and uncertainty about the quality of the DEM-based environmental modelling applications are important issues. However, knowledge of DEM error, quality and uncertainty and the ability to handle these issues are at a primitive stage (Heywood et al., 1998).

The research presented in this thesis aims to improve knowledge of DEM errors and DEM quality and increase understanding of how limited DEM quality affects uncertainty associated with the outcomes of DEM-based spatial models. The thesis presents methodologies and software tools to help quantify and communicate DEM quality and uncertainty.

This final chapter brings all the research together by summarising key findings and issues in §6.1 to §6.3. §6.4 then identifies areas for further study.

The key findings of the research fall into three categories:

• Methodologies and tools for assessing DEM quality;

• Systematic investigation of the causes of reduced DEM quality;

• Modelling uncertainty in DEM-derived topographic variables.

6.1 Methodologies and Tools for Assessing DEM Quality

Chapter 3 presents a holistic approach to assessing the quality of a DEM. The common practice of reporting the RMSE of a DEM gives an inadequate description of a DEM's quality. DEM quality involves both the accuracy and the geomorphometric characteristics of the surface model. A more detailed and comprehensive report on the quality of a DEM is required. The report can be undertaken at one of two levels. The first level involves assessing the geomorphometric quality of a DEM. This assessment is simple to undertake and involves no additional data. The assessment should be undertaken in all situations for all DEMs. The second level involves assessing both the geomorphometric quality and the accuracy of a DEM. The accuracy assessment is more complex and involves either fieldwork or purchase of higher accuracy data. Accuracy assessment is beneficial in all situations, but should be regarded as a necessity only where insufficiently accurate elevation values and DEM-derived topographic variables are of major consequence. Accuracy assessment also constitutes an initial step to modelling uncertainty in DEM-based spatial modelling outcomes (see §6.3). The methods involved in geomorphometric quality assessment and accuracy assessment are stated below with reference to use of the “DEMUncQual” ArcView extension that has been developed as a set of tools for assessing and modelling DEM quality and uncertainty.

6.1.1 Geomorphometric Quality Assessment

Geomorphometric quality assessment considers the general form of the surface model. The first step is to generate displays of the DEM and derivative grids. Recommended displays are:

An orthographic pseudo-perspective display of the whole DEM and zoomed in views of small sub-areas;



2-D rendering of the gradient grid;



2-D rendering of the aspect grid.

When viewing these displays the user should be looking for the following: 

Representation of all major landform features;



The way in which individual landforms are represented;



Presence of interpolation artefacts (e.g. terracing, flat topped peaks, ramps, pits).

This visual approach to DEM quality assessment allows the user of a DEM to become familiar with the way in which the DEM represents landforms and provides the opportunity for identifying particular issues or problems that could arise in subsequent DEM-based analysis.

Calculation of three geomorphometric indices (contour bias, flatness and pit volume) represents a quantitative and less subjective means of quality assessment. Once calculated these indices can be recorded in the DEM’s documentation, such as ArcView’s theme properties or Idrisi’s raster documentation file. The DEMUncQual ArcView extension gives a “Geomorphometric Quality Indices” menu item, which calculates the indices, reports the results on screen and stores the results as part of the theme’s properties. The geomorphometric indices allow comparison of DEMs. However, the indices are a function of terrain character as well as DEM quality. Therefore, they should be used with care when comparing DEMs of different areas.

6.1.2 Accuracy Assessment

Accuracy assessment involves comparing the elevations of a sample of cells from the DEM with elevation values recorded at a higher degree of accuracy. The highest practical accuracy can be recorded by differential correction of GPS carrier phase data. Where this approach is not feasible the higher accuracy data may be obtained from another source of digital elevation data, such as contour vertices not used in the DEM generation process, another higher accuracy DEM or elevation values photogrammetrically derived from large scale aerial photography. However, such secondary sources of data will not represent the full accuracy of the DEM with relation to “true” elevation, as they themselves will be subject to errors.

Equations derived by Li (1991; Equation 3.2 and Equation 3.3) can be used to determine an appropriate number of sample points at which accuracy is calculated. The sample points should be distributed evenly over the study area and represent the variety of terrain types present. The “Accuracy Measures” menu item in the DEMUncQual ArcView extension calculates three accuracy measures (standard deviation of error, reliability index and mean error). The results are reported on screen and stored as part of the DEM’s theme properties. These global accuracy measures, summarising accuracy across the whole DEM, are useful for quickly considering a DEM’s accuracy and comparing two or more DEMs.
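As an illustration of the global accuracy measures described above, the following minimal Python sketch computes mean error, standard deviation of error and RMSE from DEM elevations and higher accuracy reference elevations at a set of check points. It is not the DEMUncQual implementation. The function name `accuracy_measures` and the sample values are hypothetical, error is assumed to be DEM elevation minus reference elevation, and the reliability index is omitted because its definition is not restated in this chapter.

```python
import math


def accuracy_measures(dem_elevations, reference_elevations):
    """Return mean error, standard deviation of error and RMSE at check points.

    Assumption: error = DEM elevation - reference elevation.
    """
    errors = [d - r for d, r in zip(dem_elevations, reference_elevations)]
    n = len(errors)
    if n < 2:
        raise ValueError("at least two check points are needed")
    mean_error = sum(errors) / n
    # Sample standard deviation: spread of the errors about the mean error.
    sd_error = math.sqrt(sum((e - mean_error) ** 2 for e in errors) / (n - 1))
    # RMSE combines systematic bias and random spread in one figure.
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    return {"mean_error": mean_error, "sd_error": sd_error, "rmse": rmse}


# Hypothetical check point values (metres).
dem = [102.0, 98.5, 110.0, 95.0]
gps = [100.0, 99.5, 107.0, 96.0]
measures = accuracy_measures(dem, gps)
print(measures)
```

A non-zero mean error points to systematic bias in the DEM, which a single RMSE value would hide.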

An accuracy surface describes the variation in accuracy across the DEM and represents a more detailed description of accuracy. This research has shown that there is a relationship between DEM accuracy and terrain characteristics. A multiple regression modelling approach to defining this relationship can be used to create an accuracy surface. The main steps involved in this approach are: 

Use the DEMUncQual “Calculate DEM Errors and Derive Terrain Parameters” menu item to create a table of error and terrain parameter values for the previously used sample point locations.



Use a statistics package’s best subsets or stepwise regression modelling functionality to determine a 20-variable regression equation for modelling the relationship between error and terrain character.



Use DEMUncQual’s “Create SD Surface” menu item to create a standard deviation of error accuracy surface from the regression equation.

This accuracy surface gives a detailed representation of how DEM accuracy varies across the study area. The surface can also be used in uncertainty simulation (§6.3).
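The regression steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the procedure implemented in DEMUncQual: absolute error at the sample points is assumed to be the response variable, only two hypothetical terrain parameters (gradient and curvature) are used instead of a full subset selected by stepwise regression, and the fitted equation is applied cell by cell to give an error magnitude surface. All function names and values are illustrative.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_regression(predictors, response):
    """Ordinary least squares; returns [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, response)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def accuracy_surface(coefficients, parameter_grids):
    """Apply the regression equation to grids of terrain parameters."""
    n_rows, n_cols = len(parameter_grids[0]), len(parameter_grids[0][0])
    surface = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            value = coefficients[0] + sum(
                b * g[i][j] for b, g in zip(coefficients[1:], parameter_grids))
            row.append(max(value, 0.0))  # an error magnitude cannot be negative
        surface.append(row)
    return surface


# Hypothetical sample points: (gradient, curvature) and absolute error (m).
predictors = [(0, 0), (10, 0), (0, 1), (10, 1), (20, 0.5)]
abs_error = [0.5, 1.5, 2.5, 3.5, 3.5]
coefficients = fit_regression(predictors, abs_error)

gradient_grid = [[0, 10], [20, 30]]
curvature_grid = [[0, 0.5], [1, 0]]
surface = accuracy_surface(coefficients, [gradient_grid, curvature_grid])
```

In practice a statistics package would also report diagnostics such as R-squared and residual plots, which this sketch leaves out.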

6.1.3 A DEM Quality Report

The preceding sections summarise how DEM quality assessment should be undertaken. In addition to storing geomorphometric quality indices and accuracy measures in the DEM’s metadata documentation, a quality report should be produced for the user’s own purposes and for communicating information to other users of the DEM. A complete quality report should contain the following elements:

Lineage of the DEM creation process: origin and date of source data; description of the source data’s pattern and density; and interpolation method, parameters and software;



A written statement describing the representation of landforms, the general nature of the terrain surface model and the presence of artefacts;



Geomorphometric indices: contour bias; flatness index; and, pit volume;



Accuracy measures: mean error; standard deviation of error; and, reliability index;



Location and name of accuracy surface grid.

The first three elements should be provided in all DEM quality reports. The last two elements involve more onerous work and are only a necessity for DEM applications in which the accuracy of the outcome is of importance. An example of a DEM quality report is shown in Appendix 8.

6.2 Systematic Investigation of the Causes of Reduced DEM Quality

A systematic investigation of the causes and extent of decreased DEM quality has been presented in Chapter 3. Every DEM is unique in terms of the terrain being modelled or the technique and data used to generate the DEM. Nonetheless, some generic observations can be made about the impact of source data and interpolation technique on the quality of the resulting DEM:

Spline interpolation produces smooth, rounded surfaces with spurious pits;

Interpolating using splines with tension, as opposed to regularised splines, decreases smoothness, creates smaller pit volumes and increases accuracy;

Inverse distance weighting produces more angular surfaces with puddles and terraces;

Linear contour interpolation produces highly angular surfaces and is prone to artefacts such as ramps and channels;

Increasing an interpolation algorithm’s search radius increases smoothness, but decreases accuracy;

Increasing the inverse distance weight increases accuracy, but decreases smoothness;

Increasing the density of source data significantly increases accuracy and can increase geomorphometric quality;

DEMs produced using the same data and methods but with different software will not necessarily be the same.

A key finding is that issues associated with choice of source data and choice of interpolation method and associated parameters have a wide-ranging impact on DEM quality. There are potentially wide ranging differences between DEMs of the same area. This reinforces the importance of assessing DEM quality before the DEM can be applied in a confident and sensible manner.

6.3 Modelling Uncertainty in DEM-derived Topographic Variables

The comprehensive approach to assessing DEM quality summarised in §6.1 significantly enhances knowledge of a DEM’s quality. However, understanding of what represents adequate quality and how this level of quality influences the reliability of spatial modelling outcomes will always be incomplete. This incomplete understanding can be described as uncertainty. Monte Carlo simulation provides an appropriate and effective means of modelling this uncertainty. Chapter 5 examined the use of an accuracy surface in Monte Carlo simulation of uncertainty in DEM-derived topographic variables. §6.3.1 summarises the stages and tools involved in this type of Monte Carlo simulation. §6.3.2 describes the key findings of applying this uncertainty modelling approach.

6.3.1 Monte Carlo Simulation

The DEMUncQual ArcView extension provides several tools to implement Monte Carlo simulation of uncertainty in topographic variables. The inputs to the simulation are a DEM and its accuracy surface, created as described in §6.1.2. The following stages are then followed to carry out the simulation:

Determine the required number of realisations;



Create random grids for perturbing the DEM;



Perform the simulation.

The extension allows modelling of uncertainty associated with seven topographic variables: elevation, gradient, aspect, flow direction, upstream area, topographic index and watershed delineation. The output of the simulation is a collection of grids describing the variability of each topographic variable in terms of several momental statistics.
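The simulation stages listed above can be illustrated with a short Python sketch for one topographic variable, gradient. It is a simplified stand-in for the DEMUncQual tools, not a reproduction of them: the perturbation is assumed to be independent zero-mean Gaussian noise scaled by the accuracy surface (spatial autocorrelation of error is ignored), gradient is computed with plain central differences, and only the per-cell mean and standard deviation are kept. Function names, grid values and the number of realisations are hypothetical.

```python
import math
import random


def perturb(dem, sd_surface, rng):
    """One realisation: add zero-mean Gaussian noise scaled by the local SD."""
    return [[z + rng.gauss(0.0, sd) for z, sd in zip(z_row, sd_row)]
            for z_row, sd_row in zip(dem, sd_surface)]


def gradient_percent(dem, cell_size):
    """Gradient (percent) at interior cells using central differences."""
    rows, cols = len(dem), len(dem[0])
    out = [[None] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dz_dx = (dem[i][j + 1] - dem[i][j - 1]) / (2.0 * cell_size)
            dz_dy = (dem[i + 1][j] - dem[i - 1][j]) / (2.0 * cell_size)
            out[i][j] = 100.0 * math.hypot(dz_dx, dz_dy)
    return out


def simulate_gradient(dem, sd_surface, cell_size, n_realisations, seed=0):
    """Return per-cell mean and SD of gradient across all realisations."""
    rng = random.Random(seed)
    rows, cols = len(dem), len(dem[0])
    total = [[0.0] * cols for _ in range(rows)]
    total_sq = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_realisations):
        grad = gradient_percent(perturb(dem, sd_surface, rng), cell_size)
        for i in range(1, rows - 1):
            for j in range(1, cols - 1):
                total[i][j] += grad[i][j]
                total_sq[i][j] += grad[i][j] ** 2
    mean_grid = [[None] * cols for _ in range(rows)]
    sd_grid = [[None] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            mean = total[i][j] / n_realisations
            variance = max(total_sq[i][j] / n_realisations - mean ** 2, 0.0)
            mean_grid[i][j] = mean
            sd_grid[i][j] = math.sqrt(variance)
    return mean_grid, sd_grid


# Hypothetical 4 x 4 DEM: a plane rising 2 m per cell eastwards, 10 m cells.
dem = [[2.0 * j for j in range(4)] for _ in range(4)]
sd_surface = [[0.5] * 4 for _ in range(4)]
mean_grid, sd_grid = simulate_gradient(dem, sd_surface, 10.0, 200)
```

Edge cells are left as `None` because central differences need a neighbour on each side.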

6.3.2 Uncertainty in Topographic Variables

Applying the Monte Carlo simulation tool to a DEM of the Mestersvig area, north east Greenland, has revealed three key findings.

First, the models of topographic variable uncertainty created using an accuracy surface are markedly different to models created using a global accuracy measure. The differences illustrate that a global accuracy measure is a limited descriptor of a DEM’s accuracy and consequently gives a poor representation of uncertainty. Using an accuracy surface gives a more detailed and truer representation of this uncertainty.

Second, RMSE grids give an overall picture of the spatially variable degree of uncertainty, while relative RMSE grids give an overall picture of the local significance of uncertainty. Other output grids allow a more detailed examination of the nature and potential consequences of uncertainty.

Third, as advocated by Eastman (1999), modelling uncertainty associated with topographic variables allows a move from hard decisions to soft, probabilistic results from which decisions can be made based on the level of risk that can be taken. Examples of such an approach, presented in §5.6.3, are the delineation of high probability and low probability watershed boundaries and best- and worst-case predictions of sea level rise impacts.
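The move from hard to soft results can be illustrated with a minimal Python sketch. Given a set of perturbed elevation realisations, it estimates for each cell the proportion of realisations in which the cell lies at or below a sea level threshold, then derives two alternative hard maps from that probability. The function names, the 0.9 probability cut-off and the sample grids are hypothetical choices for illustration, not values taken from §5.6.3.

```python
def inundation_probability(realisations, sea_level):
    """Proportion of realisations in which each cell is at or below sea_level."""
    n = len(realisations)
    rows, cols = len(realisations[0]), len(realisations[0][0])
    return [[sum(1 for r in realisations if r[i][j] <= sea_level) / n
             for j in range(cols)]
            for i in range(rows)]


def threshold_map(probability, minimum):
    """Hard map of cells whose probability is at least `minimum`."""
    return [[p >= minimum for p in row] for row in probability]


# Three hypothetical elevation realisations of a 1 x 3 DEM (metres).
realisations = [
    [[0.8, 1.2, 2.0]],
    [[0.9, 0.9, 1.8]],
    [[0.7, 1.1, 1.6]],
]
probability = inundation_probability(realisations, sea_level=1.0)

# Low-risk view: only cells that are very likely to be inundated.
likely = threshold_map(probability, 0.9)
# Precautionary view: any cell inundated in at least one realisation.
possible = [[p > 0.0 for p in row] for row in probability]
```

A decision maker can choose between the two maps, or any intermediate cut-off, according to the level of risk that is acceptable.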

6.4 Further Work

The research presented in this thesis has contributed to the body of knowledge concerning the quality of DEMs and the nature of uncertainty in DEM-based spatial modelling. A number of tools are presented that help assess DEM quality and permit modelling of uncertainty in topographic variables. However, DEM quality and uncertainty modelling are extensive and complex fields of study and there is much potential for further work. Some areas, directly related to the research presented here, where further work would be beneficial are described below.

6.4.1 Quality Assessment Work

1. Accuracy assessment in Chapters 3 and 4 used field-surveyed GPS elevation measurements. This gives the best practical assessment of the deviation of a DEM from “true” elevation. Most of the DEMs were produced from approximately 50% of the contour vertices available in the original contour line data. The accuracy assessment could be repeated using the discarded contour vertices. The difference between the two assessments should indicate how much inaccuracy is due to the limited quality of the source data and how much is due to the DEM generation process.

2. The accuracy assessment involved using equations by Li (1991) to determine an appropriate number of GPS sample points for estimating global accuracy measures. The same number of sample points was then used to create the accuracy surface. Li did not derive the equations with this purpose in mind. It is not known whether a sufficient number of sample points have been used in creating the accuracy surface. Further work is needed to investigate the influence of number of sample points on the reliability of the accuracy surface.

3. In determining the relationship between DEM error and terrain character, terrain parameters have been measured at four spatial scales by applying mean filters with windows of 5, 10 and 20 cell radii. The filter window sizes were chosen simply to give a range of measurement scales. Further work is needed to investigate the relationship between DEM error and terrain parameter measurement scale.
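To make the idea of measurement scale in item 3 concrete, the following Python sketch applies a mean filter with a circular window of a given cell radius to a terrain parameter grid. It assumes a circular window truncated at the grid edge; the thesis does not restate the window shape or edge handling here, so both are illustrative choices, as is the function name.

```python
def circular_mean_filter(grid, radius):
    """Mean of all cells within `radius` cells of each cell (edge-truncated)."""
    rows, cols = len(grid), len(grid[0])
    # Offsets of every cell whose centre lies inside the circular window.
    offsets = [(di, dj)
               for di in range(-radius, radius + 1)
               for dj in range(-radius, radius + 1)
               if di * di + dj * dj <= radius * radius]
    out = []
    for i in range(rows):
        row = []
        for j in range(cols):
            values = [grid[i + di][j + dj]
                      for di, dj in offsets
                      if 0 <= i + di < rows and 0 <= j + dj < cols]
            row.append(sum(values) / len(values))
        out.append(row)
    return out


# Hypothetical gradient grid with a single steep cell.
gradient = [[0.0, 0.0, 0.0],
            [0.0, 5.0, 0.0],
            [0.0, 0.0, 0.0]]
smoothed = circular_mean_filter(gradient, 1)
```

Running the same function with radii of 5, 10 and 20 on a full-size grid would give the progressively more generalised versions of a parameter that the regression can then draw on.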

6.4.2 Uncertainty Modelling Work

1. A simple method has been adopted to define the number of Monte Carlo realisations required to give reliable uncertainty models. The assumption is that variability in simulation results levels out when the variation between DEM realisations levels out. Research is required to check that this assumption is true.

2. A basic Monte Carlo simulation tool has been developed for investigating uncertainty in certain topographic variables. Further work in developing a generic tool for Monte Carlo simulation of any sequence of spatial analysis operations using a DEM, or indeed other data sets, would be very beneficial to the GIS research community. Availability of such a tool would hopefully kickstart uncertainty modelling and investigation in a wide variety of application areas.

3. Monte Carlo simulation is one of a number of possible applications of a DEM’s accuracy surface. Two other potential applications were described in §5.6.4: probabilistic thresholding techniques and a probability-based flow accumulation algorithm. Further work is required to develop tools implementing these applications of an accuracy surface and also to assess the validity and value of these modelling approaches.
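One way to picture the stopping rule discussed in item 1 is the Python sketch below. It tracks the running standard deviation of a single summary value across realisations and stops once that standard deviation has changed by less than a tolerance for a set number of consecutive realisations. This is an assumed, simplified criterion for illustration only; the tolerance, window length and the choice of summary value are hypothetical and are not the rule used in the thesis.

```python
import math
import random


def realisations_needed(draw, tolerance, window=10, max_n=5000):
    """Number of realisations after which the running SD has levelled out.

    `draw` returns one summary value per realisation. The running SD is
    updated with Welford's online algorithm.
    """
    n, mean, m2 = 0, 0.0, 0.0
    previous_sd, stable = None, 0
    while n < max_n:
        x = draw()
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
        if n < 2:
            continue  # an SD needs at least two values
        sd = math.sqrt(m2 / (n - 1))
        if previous_sd is not None and abs(sd - previous_sd) < tolerance:
            stable += 1
        else:
            stable = 0
        if stable >= window:
            return n
        previous_sd = sd
    return max_n  # did not level out within the allowed number of runs


# Hypothetical summary value: elevation of one cell perturbed with SD 2 m.
rng = random.Random(42)
n_required = realisations_needed(lambda: 100.0 + rng.gauss(0.0, 2.0), 0.01)
```

Checking whether levelling out of the input variation also implies levelling out of each derived variable is exactly the open question raised in item 1.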

The work presented in this thesis was undertaken in light of the limited knowledge and understanding of DEM uncertainty. This primitive stage of understanding is at odds with the widespread and critical use made of DEMs. The work presented is intended to add to the body of knowledge concerned with DEM uncertainty. It is hoped that the software tools described will encourage further investigation and help DEM users outside academia to consider and take account of DEM quality and the associated uncertainty.

References

Acevedo, W., 1991, First assessment of US Geological Survey 30-minute DEMs: a great improvement over existing 1-degree data. Proceedings of the 1991 ACSM/ASPRS Annual Conference Volume 2, Baltimore, Maryland, pp. 1-12. Adkins, K.F. & Merry, C.J., 1994, Accuracy assessment of elevation data sets using the Global Positioning System. Photogrammetric Engineering and Remote Sensing, 60(2), 195-202. Allen, J. & Shears, J., 1995, A digital view: softcopy photogrammetry in GIS. GIS Europe, September 1995, 48-51. Balce, A.E., 1987, Determination of optimum sampling interval in grid Digital Elevation Models (DEM) data acquisition. Photogrammetric Engineering and Remote Sensing, 53, 323-330. Band, L.E., 1986, Topographic partition of watersheds with digital elevation models. Water Resources Research, 22, 15-24. Band, L.E., Vertessy, R. & Lammers, R.B., 1995, The effect of different terrain representations on simulated watershed processes. Zeitschrift für Geomorphologie. December 1995, 187-199. Blais, J.A.R., Chapman, M.A. & Lam, W.K., 1986, Optimum interval sampling in theory and practice. Proceedings of the 2nd International Symposium on Spatial Data Handling. International Geographical Union, Columbus, Ohio, USA, pp. 185-192. Blakemore, M., 1984, Generalization error in spatial databases. Cartographica, 21, 131-139. Burrough, P.A., 1986, Principles of Geographic Information Systems for Land Resources Assessment. Oxford University Press, Oxford. Burrough, P.A. & McDonnell, R.A., 1998, Principles of Geographical Information Systems. Oxford University Press, Oxford. Campbell, J., 1991, Introductory Cartography, W.C. Brown, Iowa. Carlisle, B. and Jordan, G., 1998, Can't see the sky for the trees. Mapping Awareness, 12(1), 26-27. Carrara, A., Bitelli, G. and Carla, R., 1997, Comparison of techniques for generating digital terrain models from contour lines. International Journal of Geographical Information Science, 11(5), 451-473.
Carter, J.R., 1988, Digital representations of topographic surfaces. Photogrammetric Engineering and Remote Sensing, 54(11), 1577-1580.

Chrisman, N.R., 1982, Methods of Spatial Analysis Based on Errors in Categorical Maps. Unpublished PhD thesis, University of Bristol. Chrisman, N.R., 1983, Epsilon filtering: a technique for automated scale changing. Proceedings of 1983 ACSM Annual Meeting, 322-331. Clark, K.J., 1993, Data constraints on GIS application development. In: Kovar, K. and Nachtnebel, H.P. (Ed.s), Application of Geographic Information Systems in Hydrology and Water Resources Management. IAHS Publication 211, 451-463. Clark Labs, 2001, Idrisi32 Version2 Online Help, Clark University, Massachusetts. Clarke, K.C., 1986, Computation of the fractal dimension of topographic surfaces using the triangular prism surface area. Computers and Geosciences, 12(5), 713-722. Clarke, K.C., 1990, Analytical and Computer Cartography, Prentice-Hall, London. Davis, J., 1986, Statistics and Data Analysis in Geology. Second Edition, John Wiley and Sons, New York. Day, T. & Muller, J.P., 1988, Quality assessment of digital elevation models produced by automatic stereomatchers from SPOT image pairs. Photogrammetric Record, 12, 797-808. Desmet, P., 1997, Effects of Interpolation Errors on the Analysis of DEMs. Earth Surface Processes and Landforms, 22, 563-580. Devore, J., 1987, Probability and Statistics for Engineering and the Sciences. Brooks Cole Publishing Company, Monterey, CA. Dozier, J., 1980, A clear-sky spectral solar radiation model for snow-covered mountainous terrain. Water Resources Research, 16, 709-718. Dozier, J. & Frew, J., 1990, Rapid calculation of terrain parameters for radiation modelling from digital elevation data. IEEE Transactions on Geoscience and Remote Sensing, 28, 963-969. Dubayah, R. & Rich, P.M., 1995, Topographic solar radiation models for GIS. International Journal of Geographical Information Science, 9, 405-419. Eastman, J.R., 1999a, Guide to GIS and Image Processing Volume 1. Clark Labs, Massachusetts. Eastman, J.R., 1999b, Guide to GIS and Image Processing Volume 2. Clark Labs, Massachusetts. Eastman, J.R., Kyem, P.A.K., Toledano, J. & Jin, W., 1993, Explorations in Geographical Information Systems, volume 4: GIS and Decision Making. UNITAR (United Nations Institute for Training and Research), Geneva. Ebner, H., 1992, Digital terrain models and their applications. GIS, 5(3), 27-30.

Ebner, H. & Reinhardt, W., 1984, Progressive sampling and DEM interpolation by finite elements. International Archives of Photogrammetry and Remote Sensing, 25(A4), 125-134. EDINA, 2001, DigiMap, http://edina.ac.uk/digimap Last accessed 23.03.02 Ehlschlaeger, C.R. & Goodchild, M.F., 1994, Uncertainty in spatial data: defining, visualizing, and managing data errors. Proceedings of GIS/LIS 1994, Phoenix, Arizona, pp. 246-253. http://www.sbg.ac.at/geo/people/elorup/diss/lit/ehl_good_1994_2/gislis.html Last accessed 26.08.98

Ehlschlaeger, C.R. and Shortridge, A., 1997, Modelling elevation uncertainty in geographical analyses. In: Kraak, M.J. and Molenaar, M. (Ed.s), Advances in GIS Research, Proceedings of the 7th International Symposium on Spatial Data Handling. Taylor and Francis, London, pp. 585-595. http://everest.hunter.cuny.edu/~chuck/SDH96/paper.html Last accessed 26.08.98 Eklundh, L. & Mårtensson, U., 1995, Rapid generation of digital elevation models from topographic maps. International Journal of Geographic Information Systems, 9(3), 329-340. Ekstrand, S., 1996, Landsat TM-based forest damage assessment: correction for topographic effects. Photogrammetric Engineering and Remote Sensing, 62(2), 151-161. ESRI, 1998, ArcView 3.1 Online Help, ESRI, Redlands, California. Evans, I.S., 1972, General geomorphometry, derivatives of altitude, and descriptive statistics. In: Chorley, R.J. (Ed.), Spatial Analysis in Geomorphology. Methuen, London, pp. 17-90. Evans, I.S., 1981, An integrated system of terrain analysis and slope mapping. Zeitschrift für Geomorphologie, Supplementband 36, pp. 274-295. Fairfield, J. & Leymarie, P., 1991, Drainage networks from grid digital elevation models. Water Resources Research, 30(6), 1681-1692. Fisher, P., 1991, First experiments in viewshed uncertainty: the accuracy of the viewshed area. Photogrammetric Engineering and Remote Sensing, 57, 1321-1327. Fisher, P., 1992, First experiments in viewshed uncertainty: simulating fuzzy viewsheds. Photogrammetric Engineering and Remote Sensing, 58, 345-352. Fisher, P., 1993, Algorithm and implementation uncertainty in viewshed analysis. International Journal of Geographical Information Systems, 7, 331-347. Fisher, P.F., 1994, Probable and fuzzy methods of the viewshed operation. In: Worboys, M.F. (Ed.), Innovations in GIS 1. Taylor and Francis, pp. 167-176.

Florinsky, I.V., 1998, Accuracy of local topographic variables derived from digital elevation models. International Journal of Geographical Information Science, 12(1), 47-61. Florinsky, I.V. and Kuryakova, G.A., 2000, Determination of grid size for digital terrain modelling in landscape investigations – exemplified by soil moisture distribution at a micro-scale. International Journal of Geographical Information Science, 14(8), 815-832. Foote, K.E. & Huebner, D.J., 1997a, Error, accuracy and precision. The Geographers Craft, University of Texas, Austin. http://www.utexas.edu/depts/grg/gcraft/notes/error/error.html Last accessed 28.08.98 Foote, K.E. & Huebner, D.J., 1997b, Managing error. The Geographers Craft, University of Texas, Austin. http://www.utexas.edu/depts/grg/gcraft/notes/manerror/manerror.html Last accessed 26.08.98 Forman, R.T.T., 1995, Land Mosaics: the ecology of landscapes and regions. Cambridge University Press, Cambridge. Frederiksen, P., Jacobi, O. & Kubik, K., 1983, Measuring terrain roughness by topological dimension. ISPRS Working Group III/3 Colloquium, Stockholm, Sweden. Frederiksen, P., Jacobi, O. & Kubik, K., 1984, Measuring Terrain Roughness by Topological Dimension. Preprint, Technical University of Denmark and Aalborg University Centre, Denmark. Fritsch, D., 1984, Proposal for the determination of the least sampling interval for DEM data acquisition. Workshop on Digital Elevation Models, Alberta Bureau of Surveying and Mapping, Edmonton, Canada. Fukushima, Y., 1988, Generation of DTM using SPOT Image near Mt. Fuji by digital image correlation. International Archives of Photogrammetry and Remote Sensing, 27(B3), 225-234. Gao, J., 1997, Resolution and accuracy of terrain representation by grid DEMs at a micro-scale. International Journal of Geographical Information Science, 11, 199-212. Garbrecht, J. & Martz, L.W., 1999, Digital elevation model issues in water resources modelling. Proceedings of 19th ESRI International User Conference, Environmental Systems Research Institute, San Diego, California, July 1999. http://grl1.grl.ars.usda.gov/topaz/esri/paperH.htm. Last accessed 12.4.2000. Giles, P.T. & Franklin, S.E., 1996, Comparison of derivative topographic surfaces of a DEM generated from stereoscopic SPOT images with field measurements. Photogrammetric Engineering and Remote Sensing, 62(10), 1165-1171.

Goodchild, M.F., 1993, Data models and data quality: problems and prospects. In: Goodchild, M.F., Parks, B.O. & Steyaert, L.T. (Ed.s), Environmental Modeling with GIS, Oxford University Press, Oxford, pp. 94-103. Goodchild, M., Buttenfield, B. & Wood, J., 1994, Introduction to visualizing data validity. In: Hearnshaw, H.M. & Unwin, D.J. (Ed.s), Visualisation in Geographical Information Systems. John Wiley & Sons, Chichester, pp. 141-149. Goodchild, M.F. and Gopal, S., 1989, The Accuracy of Spatial Databases. Taylor and Francis, London. Gurnell, A.M. & Montgomery, D.R., 2000, Hydrological Applications of GIS. Wiley, Chichester. Guth, P., 1992, Spatial analysis of DEM error. Proceedings of ASPRS/ACSM Annual Meeting, Washington DC, 187-196. Guth, P., 1995, Slope and aspect calculations on gridded digital elevation models: examples from a geomorphometric toolbox for personal computers. Zeitschrift für Geomorphologie. December 1995, 31-52. Hartshorne, J., 1996, Assessing the influence of digital terrain model characteristics on tropical slope stability analysis. Proceedings of the GIS Research UK 1996 Conference, University of Kent at Canterbury, pp. 3-27. Hearnshaw, H.M. & Unwin, D.J., 1994, Visualization in Geographical Information Systems. John Wiley & Sons, Chichester. Heuvelink, G.B.M., 1998, Error Propagation in Environmental Modelling with GIS. Taylor and Francis, London. Heuvelink, G.B.M. and Burrough, P.A., 1993, Error propagation in cartographic modelling using Boolean logic and continuous classification. International Journal of Geographical Information Systems, 7(3), 231-246. Heywood, I., Cornelius, S. & Carver, S., 1998, An Introduction to Geographical Information Systems. Longman, Harlow. Heywood, D.I., Smith, G., Carlisle, B. & Jordan, G., 1999, Global Positioning Systems as a practical fieldwork Tool: applications in mountain environments. In: M. Pacione (Ed.), Applied Geography: Principles and Practice. Routledge, London, pp. 593 - 604. 
Hodgson, M.E., 1995, What cell size does the computed slope/aspect angle represent? Photogrammetric Engineering and Remote Sensing, 61, 513-517. Horn, B.K., 1982, Hill shading and the reflectance map. GeoProcessing 2, 65-146. Hunter, G.J., 1998, Managing uncertainty in GIS. NCGIA Core Curriculum in GIScience. http://www.ncgia.ucsb.edu/education/curricula/giscc/units/u187/u187.html Last accessed 12.4.2000.

Hunter, G.J. and Goodchild, M.F., 1997, Modeling the uncertainty of slope and aspect estimates derived from spatial databases. Geographical Analysis, 29, 35-49. Hutchinson, M.F., 1988, Calculation of hydrologically sound digital elevation models. Proceedings, 3rd International Symposium on Spatial Data Handling, International Geographical Union, Sydney, Australia, 117-133. Hutchinson, M.F., 1989, A new procedure for gridding elevation and stream line data with automatic removal of spurious pits. Journal of Hydrology, 106, 211-232. Jenson, S.K. & Domingue, J.O., 1988, Extracting topographic structure from digital elevation data for geographic information system analysis. Photogrammetric Engineering and Remote Sensing, 54(11), 1593-1600. Jones, K.H., 1998, A comparison of algorithms used to compute hill slope as a property of the DEM. Computers and Geosciences, 24(4), 315-323. Kennie, T.J.M., 1990, Field data collection for terrain modelling. In: Petrie, G. & T.J.M. Kennie (Ed.s), Terrain Modelling in Surveying and Civil Engineering. Whittles, London, pp. 4-16. Köstli, A. & Wild, E., 1984, A digital elevation model featuring varying grid size. International Archives of Photogrammetry and Remote Sensing, 25(A36), 1130-1138. Kroll, C. and Stedinger, J., 1996, Estimation of moments and quantiles using censored data. Water Resources Research, 32(4), 1005-1012. Kumar, L., Skidmore, A.K. & Knowles, E., 1997, Modelling topographic variation in solar radiation in a GIS environment. International Journal of Geographical Information Science, 11, 475-498. Kumler, M.P., 1994, An intensive comparison of Triangulated Irregular Networks (TIN) and Digital Elevation Models (DEM). Cartographica, 31(2), Monograph 45. Kyriakidis, P.C., Shortridge, A.M. and Goodchild, M.F., 1999, Geostatistics for conflation and accuracy assessment of digital elevation models. International Journal of Geographical Information Science, 13(7), 677-707.
Lam, N.S., 1983, Spatial Interpolation Methods: a review. The American Cartographer, 10(2), 129-149. Li, Z., 1988, On the Measure of Digital Terrain Model Accuracy. Photogrammetric Record, 12(72), 873-877. Li, Z., 1991, Effects of checkpoints on the reliability of DTM accuracy estimates obtained from experimental tests. Photogrammetric Engineering & Remote Sensing, 57(10), 1333-1340. Li, Z., 1993a, Theoretical models of the accuracy of digital terrain models: an evaluation and some observations. Photogrammetric Record, 14(82), 651-659.

Li, Z., 1993b, Mathematical models of the accuracy of digital terrain model surfaces linearly constructed from square gridded data. Photogrammetric Record, 14(82), 661-673. Light, D.L., 1993, The National Aerial Photograph Program as a geographic information system resource. Photogrammetric Engineering and Remote Sensing, 59(1), 61-65. Lillesand, T.M. & Kiefer, R.W., 2000, Remote Sensing and Image Interpretation. 4th Ed. John Wiley & Sons, New York. Longley, P.A., Goodchild, M.F., Maguire, D.J. & Rhind, D.W., 2001, Geographic Information Systems and Science. John Wiley & Sons, Chichester. Magellan, 1994, Magellan GPS ProMark X User Guide. Magellan Systems Corporation, California. Maidment, D.R., 1996, Environmental modelling with GIS. In: Goodchild, M.F., Steyaert, L.T., Parks, B.O., Johnston, C., Maidment, D., Crane, S. & Glendinning, S. (Ed.s), GIS and Environmental Modelling: Progress and Research Issues. GIS World Books, Fort Collins, Colorado, pp. 315-324. Makarovic, B., 1979, From progressive to composite sampling for digital terrain models. Geo-Processing, 1, 145-166. Makarovic, B., 1973, Progressive sampling methods for digital elevation models. ITC Journal, 3, 397-416. Mark, D.M., 1984, Automated detection of drainage networks from digital elevation data. Cartographica, 21, 168-178. Martz, L.W. & Garbrecht, J., 1992, Numerical definition of drainage network and subcatchment areas from digital elevation models. Computers and Geosciences, 18(6), 747-761. Martz, L.W. & Garbrecht, J., 2000, The treatment of flat areas and depressions in automated drainage analysis of raster digital elevation models. In: Gurnell, A.M. & Montgomery, D.R. (Ed.s), Hydrological Applications of GIS, Wiley, Chichester, pp. 23-35. McDermid, G.J. & Franklin, S.E., 1995, Remote sensing and geomorphometric discrimination of slope processes. Zeitschrift für Geomorphologie. December 1995, 165-185. McLaren, R.A. & Kennie, T.J.M., 1989, Visualisation of digital terrain models: techniques and applications. In: Raper, J. (Ed.), Three Dimensional Applications in Geographical Information Systems. Taylor and Francis, London, pp. 70-98. Miller, D.R. and Morrice, J.G., 1996, Assessing uncertainty in catchment boundary delimitation. Proceedings of 3rd International conference on GIS and Environmental Modelling, Jan 1996, Santa Fe, New Mexico.

http://ncgia.ucsb.edu/conf/SANTA_FE_CD/papers/miller1_david/miller_paper1.html Last accessed 16.7.97. Miller, R.L. & Kahn, J.S., 1962, Statistical Analysis in the Geological Sciences. John Wiley, London. Mitasova, H., Hofierka, J., Zlocha, M. & Iverson, L.R., 1996, Modeling topographic potential for erosion and deposition using GIS. International Journal of Geographical Information Systems, 10(5), 629-641. Mitasova, H., Mitas, L., Brown, W.M., Gerdes, D.P., Kosinovsky, I. & Baker, T., 1995, Modelling spatially and temporally distributed phenomena: new methods and tools for GRASS GIS. International Journal of Geographical Information Systems, 9, 433-446. Monckton, C., 1994, An investigation into the spatial structure of error in digital elevation data. In: Worboys, M.F. (Ed.), Innovations in GIS 1. Taylor and Francis, pp. 201-211. Moore, I.D., Grayson, R.B. & Ladson, A.R., 1991, Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrological Processes, 5, 3-30. National Geographic, 2001, MapMachine, National Geographic. http://plasma.nationalgeographic.com/mapmachine/index.html Last accessed 11.8.2001. Openshaw, S., Charlton, M. & Carver, S., 1991, Error propagation: a Monte Carlo simulation. In: Masser, I. & Blakemore, M. (Ed.s), Handling Geographical Information: methodology and potential applications. Longman Scientific and Technical, Harlow, pp. 78-101. Ordnance Survey, 1999, Land-Form PROFILE User Guide, Version 3.0 Data. http://www.ordsvy.gov.uk/downloads/height/profile/profil_w.pdf. Last accessed 10.8.2000. Ordnance Survey, 1992, 1:50 000 Scale Height Data User Manual. Ordnance Survey, Southampton. Petrie, G., 1990a, Photogrammetric methods of data acquisition for terrain modelling. In: Petrie, G. & Kennie, T.J.M. (Ed.s), Terrain Modelling in Surveying and Civil Engineering. Whittles Publishing, London, pp. 26-54. Petrie, G., 1990b, Terrain data acquisition and modelling from existing maps. In: Petrie, G. and Kennie, T.J.M. (Ed.s), Terrain Modelling in Surveying and Civil Engineering, Whittles Publishing Services, London, pp. 85-111. Peucker, T.K., 1980, The impact of different mathematical approaches to contouring. Cartographica, monograph 25, 73-95. Pike, R.J., 1988, The geometric structure: quantifying landslide terrain types from digital elevation models. Mathematical Geology, 20(5), 491-511.

References

203

Price, M.F. & Heywood, D.I., 1994, Mountain Environments and Geographic Information Systems, Taylor and Francis, London. Rhind, D., 1975, A skeletal overview of interpolation techniques. Computer Applications, 2 (3&4), 293-309. Rieger, W., 2000. A phenomenon-based approach to upslope contributing area and depressions in DEMs. In: Gurnell, A.M. & Montgomery, D.R. (Ed.s), Hydrological Applications of GIS, Wiley, Chichester, pp. 37-52. Rossiter, D.G., 1995, Land evaluation. Part 5: risk and uncertainty. SCAS Teaching Series No. T94-1, Department of Soil, Crop and Atmospheric Sciences, Cornell University. http://wwwscas.cit.cornell.edu/landeval/le_notes/s494ch5p.htm Last accessed 02.08.01 Sasowsky, K.C., Petersen, G.W. & Evans, B.M., 1992, Accuracy of SPOT digital elevation model and derivatives: utility for Alasks‟s North Slope. Photogrammetric Engineering and Remote Sensing, 58(6), 815-824. Schumaker, L.L., 1976, Fitting surfaces to scattered data. In: Lorentz, G. (Ed.), Approximation Theory II, Academic Press, New York, pp. 203-267. Schut, G., 1976, Review of interpolation methods for digital terrain models. The Canadian Surveyor, 30, 389-412. Shearer, J.W., 1990, The accuracy of digital terrain models. In: Petrie, G. & Kennie, T.J.M. (Ed.s), Terrain Modelling in Surveying and Civil Engineering, Whittles Publishing, London, pp. 315-336. Skidmore, A.K., 1989, A comparison of techniques for calculating gradient and asoect from a gridded digital elevation model. International Journal of Geographical Information Systems, 3, 323-334. SPSS, 1999, SPSS version 9 Online Help. Star, J. & Estes, J., 1990, Geographical Information Systems: An Introduction. Prentice Hall, Englewood Cliffs. Stocks, A.M. & Heywood, D.I., 1994, Terrain modelling for mountains. In: Price, M.F. & Heywood, D.I. (Ed.s), Mountain Environments and Geographic Information Systems, Taylor and Francis, London, pp. 25 – 40. Theobald, D.M., 1989, Accuracy and bias issues in surface representation. 
In: Goodchild, M. & Gopal, S. (Ed.s), The Accuracy of Spatial Databases. Taylor & Francis, London, pp. 99 – 106. Tribe, A., 1992. Automated recognition of valley lines and drainage networks from grid digital elevation models: a review and a new method. Journal of Hydrology, 139, 263-293.

References

204

Tuladhar, A. & Makarovic, M.B., 1988, Upgrading DTMs from contour lines using photogrammetric selective sampling. ITC Journal, 1988, 339-344. UCGIS (University Consortium for Geographical Information Science), 1997, Uncertainty in geographic data and GIS-based analyses. UCGIS Research Priorities: Paper 9. http://www.ncgia.ucsb.edu/other/ucgis/research_priorities/paper9.html Last accessed 02.08.01 USGS, 1998, National Mapping Program Technical Instructions – Standards for Digital Elevation Models Part 2: Specifications. http://rockyweb.cr.usgs.gov/nmpstds/acrodocs/dem/PDEM0198.PDF Last accessed 10.8.2000. Veregin, H., 1994. Integration of simulation modelling and error propagation for the buffer operation in GIS. Photogrammetric Engineering and Remote Sensing, 60(4), 427-435. Weibel, R. & Brändli, M., 1995, Adaptive methods for the refinement of digital terrain models for geomorphometric applications. Zeitschrift für Geomorphologie. December 1995, 13-30. Weibel, R. and DeLotto, J.S., 1988, Automated terrain classification for GIS modelling. Proceedings of GIS/LIS ’88, San Antonio, pp. 618-627. Weibel, R. & Heller, M., 1991, Digital terrain modeling. In: Maguire, D.J., Goodchild, M.F. & Rhind, D.W. (Ed.s), Geographical Information Systems Volume 1: principles, Longman, London, pp. 269-297. Wood, J.D., 1993, Measuring and Reporting the Accuracy of Ordnance Survey Digital Elevation Data. Ordnance Survey Technical Report, Ordnance Survey, Southampton. Wood, J.D., 1994, Visualising contour interpolation accuracy in digital elevation models. In: Hearnshaw, H.M. & Unwin, D.J. (Ed.s), Visualization in Geographical Information Systems. John Wiley & Sons, Chichester, pp. 168-180. Wood, J.D., 1996, The Geomorphological Characterisation of Digital Elevation Models. Unpublished PhD Thesis, University of Leicester. http://www.soi.city.ac.uk/~jwo/phd/ Last accessed 18.7.01 Yoeli, P., 1986., Computer executed production of a regular grid of height points from digital contours. 
The American Cartographer, 13(3), 219-229. Zevenbergen, L.W. & Thorne, C.R., 1987, Quantitative analysis of land surface topography. Earth Surface Processes and Landforms, 12, 47-56. Zhang, W. & Montgomery, D.R., 1994, Digital elevation model grid size, landscape representation and hydrologic simulations. Water Resources Research, 30, 10191028.

References

205

Appendix 1: Kriging Trial

A brief trial was undertaken to investigate whether interpolation with Kriging produced a DEM of better quality than other interpolation methods. Ordinary Kriging with a Gaussian semi-variogram model was used to create okgauss12. This DEM was compared with two DEMs produced using inverse distance weighting: idw112 and idw212. Geomorphometric indices and accuracy measures were calculated for all three DEMs (Table A1.1 and Table A1.2).

Table A1.1 Kriging geomorphometric indices
DEM         Contour bias   Flatness index   Blockiness index   Pit volume
okgauss12   11.82          2.11             0.94               3694
idw112      13.30          1.82             1.59               15467
idw212      23.19          1.91             1.36               23326

Table A1.2 Kriging accuracy measures
DEM         Mean error   ADM    Standard deviation   90th percentile   Reliability
okgauss12   -2.68        3.84   5.24                 4.93              2
idw112      -2.72        3.74   5.11                 8.11              1
idw212      -2.76        3.13   4.45                 7.19              2

The okgauss12 DEM has better contour bias and blockiness indices and the pit volume is much lower. The flatness index is slightly higher. Overall the geomorphometric quality of the Kriging DEM is higher. The mean error and 90th percentile are lower for okgauss12. The average deviation from the mean (ADM) and standard deviation are higher. Reliability is the same as idw212 and lower than idw112. Overall the accuracy of the Kriging DEM is slightly lower.

These findings indicate that the Kriging DEM is of similar quality to the inverse distance weighted DEMs. There is no clear benefit to undertaking the additional effort involved in interpolating a DEM using Kriging.


Appendix 2: The DEM Uncertainty and Quality ArcView Extension

The DEM Uncertainty and Quality extension to ArcView (“DEMUncQual.avx”; Appendix 9) adds a menu to ArcView’s View interface. Selecting an item in this menu runs an ArcView script that automates the processing steps needed to assess a DEM’s quality and to model the uncertainty associated with DEM derivatives.

The DEM Uncertainty and Quality menu is shown in Fig. A2.1.

Fig. A2.1 The DEM Uncertainty and Quality menu.

Table A2.1 lists the scripts associated with each menu item. Appendix 9 gives the ArcView Avenue code of these scripts.


Table A2.1 Scripts associated with DEM Uncertainty and Quality menu items
Menu item                                              Script
Calculate geomorphometric quality indices              DEMUncQual_GeoIndices.ave
Calculate accuracy measures                            DEMUncQual_AccMeasures.ave
Calculate DEM errors and derive terrain parameters     DEMUncQual_TPs.ave
Create SD surface                                      DEMUncQual_Esurf.ave
Determine number of simulations (one value)            DEMUncQual_NTestOne.ave
Determine number of simulations (range of values)      DEMUncQual_NTestMany.ave
Make random grids                                      DEMUncQual_MakeRandomGrids.ave
Monte Carlo simulation                                 DEMUncQual_MonteCarlo.ave

On selecting a menu item, a series of dialogue boxes is displayed for choosing data sets and specifying the necessary parameters, then the script is run. An explanation of how to use the DEM Uncertainty and Quality extension is given below in §A2.1 to §A2.9. Flow diagrams for the main scripts are given in §A2.10 to §A2.17.

A2.1 Installing and Loading the Extension

The DEM Uncertainty and Quality extension requires that the Spatial Analyst (“spatial.avx”) and Hydrological Modelling (“hydrov11.avx”) extensions are installed. These extensions are available from ESRI.

Copy the “DEMUncQual.avx” file to the Ext32 folder within the ArcView installation folder:
   o The exact location of the folder will vary depending on your installation of ArcView, but will be similar to c:\esri\av_gis30\arcview\ext32
Start ArcView.
Select File – Extensions.
In the list of available extensions in the extension dialogue, click on the square to the left of DEM Uncertainty and Quality Modelling. Click OK.
The Surface, Hydro and DEMUncQual menus are added to the View interface.

A2.2 Calculating Geomorphometric Quality Indices

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Select DEMUncQual – Calculate geomorphometric quality indices.
Select the DEM to be assessed from the list of available grid themes. Click OK.
Enter the contour interval of the source data from which the DEM was created. Click OK.
Click YES or NO to specify whether there is a pit-filled version of the DEM present in the View.
   o If YES to the above, select the pit-filled DEM.
   o If NO to the above, a pit-filled version of the DEM is created. This may take a few minutes. N.B. This pit-filled version of the DEM is not saved. Use Hydro – Fill sinks to create a pit-filled DEM that can be saved.
The geomorphometric indices are calculated – this may take a few minutes, depending on the size of the DEM and the processing speed of the computer.
A message box reports the contour bias, flatness index and pit volume of the DEM. Click OK to close the message box and end the script.
The results are written to the comments section of the DEM theme’s properties. Select Theme – Properties to view the DEM theme’s properties.

A2.3 Calculating Accuracy Measures

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Add a control point shapefile as a theme to the View, if it is not already loaded. This theme should contain control points representing locations where elevation is measured at a higher degree of accuracy. The theme’s attribute table should contain a field storing the higher accuracy elevation values.
Select DEMUncQual – Calculate accuracy measures.
Select the DEM to be assessed from the list of available grid themes. Click OK.
Select the control point theme from the list of available shapefile themes. Click OK.
Select the field storing higher accuracy elevation values from the list of the control point theme’s attribute table fields. Click OK.
The accuracy measures are calculated – this may take a few minutes, depending on the size of the DEM and the processing speed of the computer.
A message box reports the mean error, standard deviation of error and reliability for the DEM. Click OK to close the message box and end the script.
The results are written to the comments section of the DEM theme’s properties. Select Theme – Properties to view the DEM theme’s properties.
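
The kind of calculation performed at this stage can be illustrated with a minimal Python sketch. It is not the Avenue code of the extension. It assumes that error is defined as DEM elevation minus control point elevation and that the 90th percentile is taken over absolute errors with a nearest-rank rule; neither convention is confirmed in this appendix. The function name `accuracy_measures` and the sample elevations are hypothetical. The ADM and 90th percentile follow the measures listed in Table A1.2; reliability is omitted because its definition is not given here.

```python
import math
import statistics


def accuracy_measures(dem_z, control_z):
    """Summarise elevation error at control points.

    Assumes error = DEM elevation - control point elevation
    (the sign convention is an assumption of this sketch).
    """
    errors = [d - c for d, c in zip(dem_z, control_z)]
    mean_error = statistics.fmean(errors)
    # Sample standard deviation of the errors.
    sd = statistics.stdev(errors)
    # Average deviation from the mean (ADM).
    adm = statistics.fmean([abs(e - mean_error) for e in errors])
    # Nearest-rank 90th percentile of absolute error (assumed convention).
    ranked = sorted(abs(e) for e in errors)
    p90 = ranked[math.ceil(0.9 * len(ranked)) - 1]
    return {"mean_error": mean_error, "sd": sd, "adm": adm, "p90": p90}


# Hypothetical elevations (metres) at five control points.
dem = [102.0, 98.5, 110.0, 95.0, 101.0]
control = [100.0, 100.0, 108.0, 96.0, 100.0]
measures = accuracy_measures(dem, control)
print(measures)
```

A negative mean error under this convention would indicate that the DEM tends to underestimate elevation at the control points.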

A2.4 Calculating DEM Errors and Deriving Terrain Parameters

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Add a control point shapefile as a theme to the View, if it is not already loaded. This theme should contain control points representing locations where elevation is measured at a higher degree of accuracy. The theme’s attribute table should contain a field storing the higher accuracy elevation values.
Select DEMUncQual – Calculate DEM errors and derive terrain parameters.
Select the DEM to be assessed from the list of available grid themes. Click OK.
Select the control point theme from the list of available shapefile themes. Click OK.
Select the field storing higher accuracy elevation values from the list of the control point theme’s attribute table fields. Click OK.
Click YES or NO to specify whether all 74 terrain parameters should be calculated.
   o If NO to the above, click YES or NO in a series of 13 dialogue boxes to specify which terrain parameters should be calculated.
Specify the filename of the output shapefile. Click OK.
The DEM errors and terrain parameters are calculated – this can take half an hour or more, depending on the number of control points, the processing speed of the computer and the number of terrain parameters that were specified.
The output shapefile’s attribute table is displayed. Each field in this table stores the value of one terrain parameter for each control point. The attribute table can be analysed in a statistics package to identify the relationship between the DEM’s error and the terrain parameters.

A2.5 Creating a Standard Deviation Surface

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Select DEMUncQual – Create SD surface.
Select the DEM to be assessed from the list of available grid themes. Click OK.
A message box gives the current working directory. The working directory is where output grids will be saved. Click YES or NO to specify whether a different working directory is required.
   o If YES to the above, type in the path name for the new working directory. Click OK.
Enter the number of variables in the DEM error / terrain parameter regression equation. Click OK.
Enter the regression equation’s constant. Click OK.
For each variable in the regression equation:
   o Choose the type of variable. Click OK.
   o Select the specific variable. Click OK.
   o Enter the coefficient. Click OK.
The DEM error and accuracy surfaces are calculated – this can take an hour or more, depending on the size of the DEM, the processing speed of the computer and the number and type of terrain parameters that were specified.
Four grids representing error, mean filtered error, standard deviation of error and standard deviation of mean filtered error are added as themes in the View. Each of these grids is saved to the working directory.
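
The first two of these grids can be illustrated with a minimal Python sketch, assuming that the error grid is the regression constant plus the sum of each coefficient multiplied by its terrain parameter grid, and that the mean filter is a square moving average. The function names, the 3 x 3 grids and the coefficients are hypothetical. The step that converts an error grid into a standard deviation grid is not detailed in this appendix and is therefore left out.

```python
def predicted_error_grid(constant, terms):
    """Evaluate constant + sum(coefficient * variable grid) cell by cell.

    `terms` is a list of (coefficient, grid) pairs; all grids are
    equal-sized lists of rows.
    """
    rows, cols = len(terms[0][1]), len(terms[0][1][0])
    out = [[constant] * cols for _ in range(rows)]
    for coef, grid in terms:
        for r in range(rows):
            for c in range(cols):
                out[r][c] += coef * grid[r][c]
    return out


def mean_filter(grid, radius=1):
    """Replace each cell by the mean of its square neighbourhood.

    The window is clipped at the grid edge (an assumed edge treatment).
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


# Hypothetical terrain parameter grids and regression coefficients.
elevation = [[100, 110, 120], [105, 115, 125], [110, 120, 130]]
gradient = [[2, 4, 6], [3, 5, 7], [4, 6, 8]]
error = predicted_error_grid(1.5, [(-0.01, elevation), (0.2, gradient)])
smoothed = mean_filter(error)
print(error[0][0], smoothed[1][1])
```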

A2.6 Determining Number of Monte Carlo Simulations (one value)

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Add the DEM’s standard deviation of error grid as a theme to the View, if it is not already loaded.
Select DEMUncQual – Determine number of simulations (one value).
Select the DEM to be assessed from the list of available grid themes. Click OK.
Enter the number of simulations for calculating realisation variability. Click OK.
Select the standard deviation grid from the list of available grid themes. Click OK.
The realisation variability is calculated – this can take an hour or more, depending on the size of the DEM, the processing speed of the computer and the number of realisations that were specified.
A message box reports the standard deviation and variance of the perturbed DEM realisations. Click OK.
Click YES or NO to specify whether the results should be written to a file.
   o If YES to the above, enter a filename. Click OK.
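
The realisation variability calculation can be sketched in Python as follows, following the stages listed in §A2.14: create N perturbed DEMs, calculate the standard deviation at each cell location, then summarise that grid. The perturbation rule used here (independent normally distributed noise scaled by each cell's standard deviation) is an assumption, and the function name `realisation_variability` and the sample grids are hypothetical.

```python
import random
import statistics


def realisation_variability(dem, sd_grid, n, seed=0):
    """Perturb a DEM n times and summarise how much the realisations vary.

    Each realisation adds normal noise with the cell's standard deviation
    (an assumed perturbation rule). Returns the standard deviation and
    variance of the per-cell standard deviation grid.
    """
    rng = random.Random(seed)
    rows, cols = len(dem), len(dem[0])
    realisations = [
        [[dem[r][c] + rng.gauss(0.0, sd_grid[r][c]) for c in range(cols)]
         for r in range(rows)]
        for _ in range(n)
    ]
    # Standard deviation across realisations at each cell location.
    cell_sd = [
        [statistics.pstdev([real[r][c] for real in realisations])
         for c in range(cols)]
        for r in range(rows)
    ]
    # Spread of the per-cell standard deviations over the whole grid.
    flat = [v for row in cell_sd for v in row]
    return statistics.pstdev(flat), statistics.pvariance(flat)


# Hypothetical 2 x 2 DEM and standard deviation grid (metres).
dem = [[100.0, 101.0], [102.0, 103.0]]
sd_grid = [[1.0, 2.0], [1.5, 0.5]]
sd, var = realisation_variability(dem, sd_grid, n=50)
print(sd, var)
```

Repeating the call with increasing values of `n` shows how the summary statistics settle as more realisations are used, which is the basis for choosing the number of simulations.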

A2.7 Determining Number of Monte Carlo Simulations (range of values)

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Add the DEM’s standard deviation of error grid as a theme to the View, if it is not already loaded.
Select DEMUncQual – Determine number of simulations (range of values).
Click YES to the warning message about the length of the process.
Select the DEM to be assessed from the list of available grid themes. Click OK.
Enter the number of simulations for calculating realisation variability. Click OK.
Enter the interval between realisation calculations (e.g. to calculate variability for 1, 11, and 21 realisations, specify 21 as the number of simulations and 10 as the interval). Click OK.
Select the standard deviation grid from the list of available grid themes. Click OK.
The realisation variability is calculated – this can take several hours, depending on the size of the DEM, the processing speed of the computer, the number of realisations and the calculation interval that were specified.
A message box reports the standard deviation of the perturbed DEM realisations for each interval. Click OK.
Click YES or NO to specify whether the results should be written to a file.
   o If YES to the above, enter a filename. Click OK.

A2.8 Making Random Grids

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Add the DEM’s standard deviation of error grid as a theme to the View, if it is not already loaded.
Select DEMUncQual – Make random grids.
Select the DEM to be assessed from the list of available grid themes. Click OK.
Select the standard deviation grid from the list of available grid themes. Click OK.
Enter the number of random grids to create. Click OK.
Specify whether to use a normal or uniform random distribution. Click OK.
The random grids are created – this can take half an hour or longer, depending on the size of the DEM, the processing speed of the computer and the number of random grids.
The random grids are added as themes in the View. These random grids are temporary files. They will be deleted on closing ArcView unless a Project is saved.
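
One way of generating such a grid is sketched below in Python. The sketch assumes that each random value is drawn with unit standard deviation and then scaled by the corresponding cell of the standard deviation grid; how the extension itself combines the two grids is not stated in this appendix. For the uniform case, the range of plus or minus the square root of 3 is chosen so that the draw also has unit standard deviation. The function name `make_random_grid` and the sample grid are hypothetical.

```python
import math
import random


def make_random_grid(sd_grid, distribution="normal", rng=None):
    """Return a grid of zero-mean random values scaled by sd_grid.

    Scaling by the standard deviation grid is an assumption of this sketch.
    """
    rng = rng or random.Random()
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit SD
    out = []
    for row in sd_grid:
        new_row = []
        for sd in row:
            if distribution == "normal":
                z = rng.gauss(0.0, 1.0)
            elif distribution == "uniform":
                z = rng.uniform(-half_width, half_width)
            else:
                raise ValueError("distribution must be 'normal' or 'uniform'")
            new_row.append(z * sd)
        out.append(new_row)
    return out


# Hypothetical 2 x 3 standard deviation grid (metres).
sd_grid = [[1.0, 2.0, 0.5], [1.5, 1.0, 3.0]]
grids = [make_random_grid(sd_grid, "uniform", random.Random(i)) for i in range(3)]
print(grids[0])
```

A uniform distribution bounds every perturbation, whereas a normal distribution occasionally produces large values; the choice therefore affects the extremes of the perturbed DEMs more than their typical spread.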

A2.9 Monte Carlo Simulation

Add the DEM to be assessed as a theme to the View, if it is not already loaded.
Add the DEM’s standard deviation of error grid as a theme to the View, if it is not already loaded.
Add the random grids or perturbed DEMs as themes to the View, if they are not already loaded.
Add a watershed source grid as a theme to the View, if watershed probability is to be calculated and if this grid is not already loaded.
Make sure that Map Units are specified. Select View – Properties to set this value.
Select DEMUncQual – Monte Carlo simulation.
Click OK to the message outlining the process.
Click YES to the message reminding you of the grid themes that are required to be loaded in the View.
Enter the number of simulations. Click OK.
Click YES or NO to specify whether to use existing perturbed DEMs (YES) or to create perturbed DEMs from random grids (NO).
   o If YES to the above, select the perturbed DEMs from the list of available grid themes. Click OK.
   o If NO to the above, select the random grids from the list of available grid themes. Click OK.
Select the DEM to be assessed from the list of available grid themes. Click OK.
Select the standard deviation grid from the list of available grid themes. Click OK.
Click YES or NO in a series of 5 dialogue boxes to specify which DEM derivatives’ uncertainty should be modelled.
   o If YES to watershed probability, select the watershed source grid from the list of available grid themes. Click OK.
A message box gives the current working directory. The working directory is where output grids will be saved. Click YES or NO to specify whether a different working directory is required.
   o If YES to the above, type in the path name for the new working directory. Click OK.
If perturbed DEMs are being created from random grids, click YES or NO to specify whether the perturbed DEMs should be saved for later use.
Click YES or NO to specify whether results should be masked by a watershed boundary.
   o If YES to the above, select the shapefile containing a polygon representing the watershed boundary.
Click YES or NO to specify whether map units are different to elevation units.
   o If YES to the above, specify the type of conversion. Click OK.
Enter probability thresholds for calculation of t statistics. Click OK.
If perturbed DEMs are being created from random grids, enter a value for spatial dependence (the size of filter window applied to random grids). Click OK.
The simulation is run – this can take several hours, depending on the size of the DEM, the processing speed of the computer, the number of realisations and the number of DEM derivatives that were specified.
A message box reports the time taken to run the scripts. Click OK.
A new View is created, the output grid themes are loaded into this View and the project file is saved.
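
The overall structure of such a simulation can be sketched in Python for a single derivative, slope. The sketch assumes that random grids are mean filtered to introduce spatial dependence, added to the DEM to give perturbed DEMs, and that the derivative is then summarised cell by cell over all realisations. The central-difference slope estimate, the clipped filter window and all function names are assumptions made for illustration and are not taken from the extension's scripts; unit conversion, masking and the t statistics are not covered.

```python
import math
import random
import statistics


def mean_filter(grid, radius):
    """Smooth a grid with a square moving-average window (edges clipped)."""
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [grid[i][j]
                      for i in range(max(0, r - radius), min(rows, r + radius + 1))
                      for j in range(max(0, c - radius), min(cols, c + radius + 1))]
            out[r][c] = sum(window) / len(window)
    return out


def slope_degrees(dem, cell_size):
    """Slope from central differences; indices are clamped at the grid edge,
    so edge cells are underestimated in this sketch."""
    rows, cols = len(dem), len(dem[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dzdx = (dem[r][min(c + 1, cols - 1)] - dem[r][max(c - 1, 0)]) / (2 * cell_size)
            dzdy = (dem[min(r + 1, rows - 1)][c] - dem[max(r - 1, 0)][c]) / (2 * cell_size)
            out[r][c] = math.degrees(math.atan(math.hypot(dzdx, dzdy)))
    return out


def monte_carlo_slope(dem, sd_grid, n, cell_size, filter_radius=1, seed=0):
    """Return per-cell mean and standard deviation of slope over n realisations."""
    rng = random.Random(seed)
    rows, cols = len(dem), len(dem[0])
    slopes = []
    for _ in range(n):
        noise = [[rng.gauss(0.0, sd_grid[r][c]) for c in range(cols)]
                 for r in range(rows)]
        # Filtering introduces spatial dependence; it also damps the noise,
        # which this sketch does not rescale.
        noise = mean_filter(noise, filter_radius)
        perturbed = [[dem[r][c] + noise[r][c] for c in range(cols)]
                     for r in range(rows)]
        slopes.append(slope_degrees(perturbed, cell_size))
    mean = [[statistics.fmean([s[r][c] for s in slopes]) for c in range(cols)]
            for r in range(rows)]
    sd = [[statistics.pstdev([s[r][c] for s in slopes]) for c in range(cols)]
          for r in range(rows)]
    return mean, sd


# Hypothetical 4 x 4 DEM rising 10 m per 10 m cell from west to east.
dem = [[10.0 * c for c in range(4)] for _ in range(4)]
sd_grid = [[1.0] * 4 for _ in range(4)]
mean_slope, sd_slope = monte_carlo_slope(dem, sd_grid, n=25, cell_size=10.0)
print(mean_slope[1][1], sd_slope[1][1])
```

The per-cell standard deviation grid plays the role of an uncertainty surface for the derivative: cells where slope changes most between realisations are those most sensitive to elevation error.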



A2.10 DEMUncQual_GeoIndices Flow Diagram

Key: User input; Optional user input; Processing stage; Optional processing stage.

Choose DEM grid
Enter contour interval
Specify whether pit-filled version of the DEM is in the View
   o Yes: Select pit-filled DEM
   o No: Create pit-filled DEM
Calculate contour bias
Calculate flatness index
Calculate pit volume
Save and display results

A2.11 DEMUncQual_AccMeasures Flow Diagram

Key: User input; Optional user input; Processing stage; Optional processing stage.

Choose DEM grid
Choose control point (CP) theme
Select CP theme’s elevation field
Make temporary table of DEM and control point elevations
Calculate mean error
Calculate standard deviation
Calculate reliability
Save and display results

A2.12 DEMUncQual_TPs Schematic

Key: User input; Optional user input; Processing stage; Optional processing stage.

Choose DEM grid
Choose control point (CP) theme
Select CP theme’s elevation field
Select parameters to calculate
Specify output theme’s name
Create buffer around CPs
Extract elevation
Calculate errors
Calculate elevation parameters
Calculate degree gradient parameters
Calculate percent gradient parameters
Calculate aspect vector parameters
Calculate overall curvature parameters
Calculate plan curvature parameters
Calculate profile curvature parameters
Calculate mean extremity parameters
Calculate maximum extremity parameters
Calculate minimum extremity parameters
Calculate relative relief parameters
Calculate degree texture parameters
Calculate percent texture parameters
Display output table

A2.13 DEMUncQual_Esurf Schematic

Key: User input; Optional user input; Processing stage; Optional processing stage.

Choose DEM grid
Set working directory
Enter number of variables
Enter constant
Choose variables and enter their coefficients
Calculate each variable grid and multiply by coefficient
Error grid = sum of variable grids
Apply mean filter to error grid and calculate standard deviation of error grids
Save and display results grids

A2.14 DEMUncQual_NTestOne Schematic

Choose DEM grid
Enter number of simulations (N)
Choose standard deviation grid
Create N perturbed DEMs
Calculate standard deviation grid (standard deviation at each cell location)
Calculate standard deviation of standard deviation grid
Report results
Specify text file for saving results
Save results

A2.15 DEMUncQual_NTestMany Schematic

Key: User input; Optional user input; Processing stage; Optional processing stage.

Choose DEM grid
Enter number of simulations (N)
Enter simulation interval (i)
Choose standard deviation grid
For n = 1 to N (increment = i):
   o Create n perturbed DEMs
   o Calculate standard deviation grid (standard deviation at each cell location)
   o Calculate standard deviation of standard deviation grid
   o Add result to results list
Report results
Specify text file for saving results
Save results

A2.16 DEMUncQual_MakeRandomGrids Schematic

Choose DEM grid
Choose standard deviation grid
Enter number of random grids to create (N)
Choose uniform or normal distribution of random values
Create N random grids
Save and display N random grids

A2.17 DEMUncQual_MonteCarlo Schematic

Key: User input; Optional user input; Processing stage; Optional processing stage.

Specify number of simulations (N)
Choose whether to use existing perturbed DEMs
   o Yes: Select N perturbed DEMs
   o No: Select N random grids; choose whether to save perturbed DEMs
Select DEM grid
Select standard deviation grid
Choose terrain parameters for modelling uncertainty
Specify working directory
If modelling watershed uncertainty, select watershed source grid
Select mask polygon
Specify t statistic thresholds
Specify filter size for smoothing
Filter random grids, add to DEM to create N perturbed DEMs
Call DEMUncQual_SimAsp_residual
Call DEMUncQual_SimFld_residual
Call DEMUncQual_SimWS
Call DEMUncQual_SimUps_residual
Call DEMUncQual_SimSlp_residual
Call DEMUncQual_SimLN_residual
Call DEMUncQual_SimErr_residual
Call DEMUncQual_CalcSD and/or DEMUncQual_tCritical
Save result grids
Display result grids



Appendix 4: Snowdon Correlations of Elevation Error with Initial 13 Terrain Parameters

Parameter  SpTen12     IDW512      IDW112
Z          -0.073      -0.212(*)   -0.234(*)
GD         -0.068      0.06        -0.07
AX         -0.164      0.023       0.022
AY         0.067       0.092       -0.079
C          0.222(*)    0.104       0.005
CH         0.269(**)   -0.076      -0.032
CV         -0.093      -0.147      -0.03
AVEX10     0.162       0.156       -0.13
MAXEX10    0.113       0.168       -0.077
MINEX10    -0.066      0.151       -0.101
REL10      -0.097      0.001       -0.017
TEXT10     -0.137      0.031       -0.026
PTDIST     0.041       0.094       0.185
* Correlation is significant at the 0.05 level (2-tailed).
** Correlation is significant at the 0.01 level (2-tailed).

Z = elevation
GD = gradient measured in degrees
AX = aspect vector X
AY = aspect vector Y
C = overall curvature
CH = plan (horizontal) curvature
CV = profile (vertical) curvature
AVEX = average extremity
MAXEX = maximum extremity
MINEX = minimum extremity
REL = relative relief
TEXT = texture measured in degrees
PTDIST = distance to nearest contour vertex
10 indicates the radius of neighbourhood parameters in number of cells
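
The way a single entry in such a table is obtained can be illustrated with a short Python sketch. It assumes that the tabulated values are Pearson product-moment correlation coefficients, which this appendix does not state explicitly. The function names and the sample values are hypothetical. The t statistic shown is the standard test statistic for a correlation coefficient with n - 2 degrees of freedom; comparing it with a two-tailed critical value gives the significance flags.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Hypothetical terrain parameter values and elevation errors at five control points.
parameter = [1.0, 2.0, 3.0, 4.0, 5.0]
error = [2.0, 1.0, 4.0, 3.0, 5.0]
r = pearson_r(parameter, error)
print(r, t_statistic(r, len(parameter)))
```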


Appendix 5: Snowdon Correlations of Elevation Error with All 225 Terrain Parameters Parameter

SpTen12

IDW512

IDW112

Z Z2 Z3 Z_AV5 Z_AV52 Z_AV53 Z_AV10 Z_AV102 Z_AV103 Z_AV20 Z_AV202 Z_AV203 Z_SD5 Z_SD52 Z_SD53 Z_SD10 Z_SD102 Z_SD103 Z_SD20 Z_SD202 Z_SD203 GD GD2 GD3 GD_AV5 GD_AV52 GD_AV53 GD_AV10 GD_AV102 GD_AV103 GD_AV20 GD_AV202 GD_AV203 GD_SD5 GD_SD52 GD_SD53 GD_SD10 GD_SD102 GD_SD103 GD_SD20 GD_SD202 GD_SD203 GP GP2

-0.073 -0.108 -0.128 -0.074 -0.108 -0.128 -0.074 -0.108 -0.128 -0.074 -0.108 -0.127 -0.08 -0.093 -0.092 -0.078 -0.085 -0.071 -0.096 -0.088 -0.055 -0.068 -0.087 -0.093 -0.063 -0.087 -0.098 -0.061 -0.082 -0.09 -0.061 -0.064 -0.051 0.015 0.016 0.013 -0.056 -0.041 -0.026 -0.287(**) -0.325(**) -0.335(**) -0.078 -0.082

-0.212(*) -0.253(**) -0.270(**) -0.213(*) -0.253(**) -0.270(**) -0.214(*) -0.253(**) -0.270(**) -0.214(*) -0.253(**) -0.270(**) -0.019 -0.062 -0.093 0.03 0.05 0.069 -0.179 -0.244(*) -0.281(**) 0.06 0.068 0.084 0.013 -0.039 -0.075 0.031 -0.019 -0.062 -0.003 -0.005 0.017 0.008 0.02 0.041 0.028 0.02 0.02 0 -0.032 -0.053 0.123 0.206(*)

-0.234(*) -0.275(**) -0.293(**) -0.234(*) -0.275(**) -0.293(**) -0.234(*) -0.275(**) -0.293(**) -0.232(*) -0.273(**) -0.292(**) 0.005 -0.008 -0.016 0.034 0.039 0.052 0.024 0.005 -0.006 -0.07 -0.101 -0.124 -0.009 -0.015 -0.015 -0.007 -0.012 -0.014 0.008 0.007 0.006 -0.049 -0.049 -0.041 -0.059 -0.072 -0.079 -0.074 -0.082 -0.085 -0.111 -0.166


Parameter GP3 GP_AV5 GP_AV52 GP_AV53 GP_AV10 GP_AV102 GP_AV103 GP_AV20 GP_AV202 GP_AV203 GP_SD5 GP_SD52 GP_SD53 GP_SD10 GP_SD102 GP_SD103 GP_SD20 GP_SD202 GP_SD203 AX AX2 AX3 AX_AV5 AX_AV52 AX_AV53 AX_AV10 AX_AV102 AX_AV103 AX_AV20 AX_AV202 AX_AV203 AX_SD5 AX_SD52 AX_SD53 AX_SD10 AX_SD102 AX_SD103 AX_SD20 AX_SD202 AX_SD203 AY AY2 AY3 AY_AV5

SpTen12 -0.064 -0.079 -0.097 -0.1 -0.077 -0.084 -0.068 -0.108 -0.102 -0.07 -0.03 0.004 0.033 -0.116 -0.091 -0.042 -0.349(**) -0.404(**) -0.409(**) -0.164 -0.199(*) -0.086 -0.181 -0.192(*) -0.114 -0.199(*) -0.163 -0.146 -0.169 -0.084 -0.081 0.017 -0.014 -0.045 -0.059 -0.053 -0.03 -0.179 -0.215(*) -0.234(*) 0.067 0.077 0.118 0.061

IDW512 0.267(**) -0.042 -0.132 -0.186 -0.026 -0.094 -0.141 -0.051 -0.037 0.002 -0.011 -0.009 0.004 -0.022 -0.03 -0.025 -0.1 -0.111 -0.098 0.023 0.025 0.116 0.019 -0.026 0.042 -0.004 0.03 0.065 -0.041 0.052 0.086 -0.011 0.003 0.027 -0.025 -0.037 -0.036 -0.017 -0.017 -0.004 0.092 0.051 0.15 0.038

IDW112 -0.208(*) -0.01 -0.007 0.003 -0.011 -0.01 -0.003 -0.009 -0.021 -0.03 -0.026 -0.007 0.013 -0.028 -0.023 -0.024 -0.063 -0.098 -0.133 0.022 -0.045 0.041 -0.006 0.026 0.042 -0.028 0.06 0.061 -0.015 0.094 0.141 -0.195(*) -0.221(*) -0.222(*) -0.183 -0.211(*) -0.217(*) -0.282(**) -0.334(**) -0.362(**) -0.079 -0.069 -0.05 -0.056

225

Parameter AY_AV52 AY_AV53 AY_AV10 AY_AV102 AY_AV103 AY_AV20 AY_AV202 AY_AV203 AY_SD5 AY_SD52 AY_SD53 AY_SD10 AY_SD102 AY_SD103 AY_SD20 AY_SD202 AY_SD203 C C2 C3 C_AV5 C_AV52 C_AV53 C_AV10 C_AV102 C_AV103 C_AV20 C_AV202 C_AV203 C_SD5 C_SD52 C_SD53 C_SD10 C_SD102 C_SD103 C_SD20 C_SD202 C_SD203 CH CH2 CH3 CH_AV5 CH_AV52 CH_AV53 CH_AV10 CH_AV102 CH_AV103 CH_AV20 CH_AV202 CH_AV203 CH_SD5

SpTen12 0.069 0.108 0.041 0.052 0.081 0.006 0.058 0.096 -0.053 0.005 0.044 -0.123 -0.043 0.022 -0.231(*) -0.143 -0.044 0.222(*) -0.235(*) 0.239(*) 0.163 -0.065 0.134 -0.034 -0.122 0.007 -0.112 -0.369(**) 0.172 0.009 -0.032 -0.073 -0.207(*) -0.320(**) -0.386(**) -0.315(**) -0.429(**) -0.484(**) 0.269(**) -0.212(*) 0.237(*) 0.291(**) -0.063 0.334(**) 0.112 -0.134 0.065 0.022 -0.213(*) -0.142 -0.07

IDW512 -0.013 0.024 0.035 0.006 0.002 -0.051 -0.003 0.031 -0.004 0.004 0.026 0.028 0.029 0.033 0.01 -0.005 -0.01 0.104 -0.086 -0.057 0.151 -0.18 -0.114 0.075 -0.063 0.011 -0.052 -0.273(**) 0.148 0.009 0.053 0.082 -0.103 -0.132 -0.141 -0.179 -0.205(*) -0.207(*) -0.076 -0.186 -0.226(*) -0.141 -0.213(*) -0.295(**) -0.101 0.302(**) -0.324(**) 0.112 -0.207(*) 0.126 -0.122

IDW112 0.038 0.045 -0.07 0.001 0.005 -0.068 0.022 0.036 -0.202(*) -0.236(*) -0.256(**) -0.213(*) -0.232(*) -0.227(*) -0.181 -0.15 -0.096 0.005 -0.346(**) -0.183 -0.135 -0.121 -0.125 -0.084 -0.059 0.07 -0.354(**) -0.308(**) -0.134 -0.067 -0.057 -0.034 0.011 0.045 0.081 0.018 0.039 0.062 -0.032 -0.308(**) -0.129 -0.134 -0.278(**) -0.201(*) -0.338(**) -0.289(**) -0.232(*) -0.564(**) -0.475(**) -0.354(**) -0.087


Parameter CH_SD52 CH_SD53 CH_SD10 CH_SD102 CH_SD103 CH_SD20 CH_SD202 CH_SD203 CV CV2 CV3 CV_AV5 CV_AV52 CV_AV53 CV_AV10 CV_AV102 CV_AV103 CV_AV20 CV_AV202 CV_AV203 CV_SD5 CV_SD52 CV_SD53 CV_SD10 CV_SD102 CV_SD103 CV_SD20 CV_SD202 CV_SD203 AVEX5 AVEX52 AVEX53 AVEX10 AVEX102 AVEX103 AVEX20 AVEX202 AVEX203 MAXEX5 MAXEX52 MAXEX53 MAXEX10 MAXEX102 MAXEX103 MAXEX20 MAXEX202 MAXEX203 MINEX5 MINEX52 MINEX53 MINEX10

SpTen12 -0.176 -0.216(*) -0.193(*) -0.312(**) -0.386(**) -0.243(*) -0.356(**) -0.423(**) -0.093 -0.205(*) -0.158 -0.016 -0.002 -0.063 0.119 -0.073 0.005 0.159 -0.399(**) -0.133 0.019 0.102 0.139 -0.198(*) -0.234(*) -0.241(*) -0.349(**) -0.444(**) -0.487(**) 0.207(*) -0.209(*) 0.251(**) 0.162 -0.106 0.135 0.022 -0.1 0.031 0.11 -0.127 0.128 0.113 -0.102 0.062 0.05 -0.018 -0.022 -0.083 -0.117 -0.14 -0.066

IDW512 -0.129 -0.11 -0.152 -0.197(*) -0.211(*) -0.185 -0.200(*) -0.194(*) -0.147 -0.028 -0.052 -0.236(*) -0.085 -0.071 -0.105 -0.08 -0.014 0.095 -0.264(**) -0.143 0.022 0.076 0.117 -0.088 -0.113 -0.116 -0.177 -0.205(*) -0.205(*) 0.12 -0.240(*) -0.216(*) 0.156 -0.193(*) -0.196(*) 0.073 -0.191(*) -0.143 0.163 -0.15 0.11 0.168 -0.142 0.1 0.01 0.021 -0.035 0.11 -0.041 -0.159 0.151

IDW112 -0.081 -0.061 -0.016 0.026 0.072 -0.025 -0.006 0.016 -0.03 -0.316(**) 0.162 0.043 0.021 0.078 -0.228(*) -0.136 -0.161 -0.165 -0.319(**) -0.195(*) -0.053 -0.05 -0.035 0.009 0.03 0.05 0.033 0.049 0.063 0.025 -0.183 -0.220(*) -0.13 -0.280(**) -0.306(**) -0.282(**) -0.323(**) -0.317(**) 0.041 -0.028 0.009 -0.077 0.141 -0.214(*) -0.161 0.206(*) -0.235(*) -0.041 -0.092 -0.121 -0.101 226

Parameter MINEX102 MINEX103 MINEX20 MINEX202 MINEX203 REL5 REL52 REL53 REL10 REL102 REL103 REL20 REL202 REL203 TEXT5 TEXT52 TEXT53 TEXT10 TEXT102 TEXT103 TEXT20 TEXT202 TEXT203 TEXTP5 TEXTP52 TEXTP53 TEXTP10 TEXTP102 TEXTP103 TEXTP20 TEXTP202 TEXTP203 PTDIST PTDIST2 PTDIST3 * **

SpTen12 IDW512 IDW112 -0.082 0.035 -0.183 -0.074 -0.027 -0.250(**) -0.165 -0.132 -0.177 -0.226(*) -0.224(*) -0.279(**) -0.241(*) -0.270(**) -0.341(**) -0.1 -0.031 -0.045 -0.129 -0.1 -0.062 -0.146 -0.134 -0.063 -0.097 0.001 -0.017 -0.111 -0.052 -0.03 -0.102 -0.082 -0.032 -0.131 -0.088 -0.016 -0.144 -0.095 -0.048 -0.127 -0.075 -0.074 0.007 0.033 -0.097 0.019 -0.003 -0.097 0.026 -0.026 -0.088 -0.137 0.031 -0.026 -0.12 -0.006 -0.02 -0.087 -0.028 -0.007 -0.288(**) -0.028 -0.022 -0.374(**) -0.076 -0.023 -0.436(**) -0.104 -0.022 -0.056 -0.011 -0.05 -0.026 -0.042 -0.025 0.008 -0.053 -0.001 -0.222(*) -0.031 0.028 -0.246(*) -0.028 0.071 -0.213(*) -0.016 0.102 -0.365(**) -0.19 -0.08 -0.452(**) -0.224(*) -0.129 -0.470(**) -0.231(*) -0.175 0.041 0.094 0.185 -0.001 0.039 0.114 -0.036 -0.01 0.066 Correlation is significant at the 0.05 level (2-tailed). Correlation is significant at the 0.01 level (2-tailed).


Z = elevation
GD = gradient measured in degrees
GP = gradient measured in percent
AX = aspect vector X
AY = aspect vector Y
C = overall curvature
CH = plan (horizontal) curvature
CV = profile (vertical) curvature
AVEX = average extremity
MAXEX = maximum extremity
MINEX = minimum extremity
REL = relative relief
TEXT = texture measured in degrees
TEXTP = texture measured in percent
PTDIST = distance to nearest contour vertex
AV = neighbourhood average
SD = neighbourhood standard deviation
5, 10 and 20 indicate the radius of neighbourhood parameters in number of cells
A trailing 2 or 3 indicates the square or cube of a parameter respectively
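The naming scheme in the key above can be illustrated with a short sketch. It is not the thesis implementation: the function names are hypothetical, and it assumes a square window clipped at the grid edge and a population standard deviation, neither of which is specified in this appendix.

```python
from statistics import mean, pstdev


def focal_stats(grid, row, col, radius):
    """Return (average, standard deviation) of the cells within `radius` cells.

    Assumptions (not stated in the source): a square window clipped at the
    grid edge, and a population standard deviation.
    """
    rows, cols = len(grid), len(grid[0])
    window = [
        grid[r][c]
        for r in range(max(0, row - radius), min(rows, row + radius + 1))
        for c in range(max(0, col - radius), min(cols, col + radius + 1))
    ]
    return mean(window), pstdev(window)


def derived_parameters(name, grid, row, col, radii=(5, 10, 20)):
    """Build the name family used in the tables, e.g. Z, Z2, Z_AV5, Z_SD103."""
    base = {name: grid[row][col]}
    for radius in radii:
        average, std_dev = focal_stats(grid, row, col, radius)
        base[f"{name}_AV{radius}"] = average
        base[f"{name}_SD{radius}"] = std_dev
    derived = {}
    for key, value in base.items():
        derived[key] = value            # the parameter itself
        derived[f"{key}2"] = value ** 2  # trailing 2 = square
        derived[f"{key}3"] = value ** 3  # trailing 3 = cube
    return derived
```

With the three radii, each point parameter such as Z yields 21 names (the value, three averages and three standard deviations, each with its square and cube), which matches the run from Z to Z_SD203 in the tables.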


Appendix 6: Mestersvig Correlations of Elevation Error with Initial 13 Terrain Parameters

Parameter   SpTen12      IDW512       IDW112
Z           -.469(**)    -.480(**)    -.445(**)
GD          -0.173       -0.067       -.206(*)
AX          .400(**)     .352(**)     .324(**)
AY          -.211(*)     -.292(**)    -.300(**)
C           -0.024       .281(**)     0.176
CH          -0.018       0.165        .282(**)
CV          0.02         -.246(*)     -0.043
AVEX10      .211(*)      .324(**)     0.071
MAXEX10     .221(*)      .271(**)     0.101
MINEX10     -0.139       -0.029       -0.093
REL10       -.197(*)     -.200(*)     -0.109
TEXT10      -0.048       -0.002       0.01
PTDIST      0.011        0.02         0.012

*  Correlation is significant at the 0.05 level (2-tailed).
** Correlation is significant at the 0.01 level (2-tailed).

Z = elevation
GD = gradient measured in degrees
AX = aspect vector X
AY = aspect vector Y
C = overall curvature
CH = plan (horizontal) curvature
CV = profile (vertical) curvature
AVEX = average extremity
MAXEX = maximum extremity
MINEX = minimum extremity
REL = relative relief
TEXT = texture measured in degrees
PTDIST = distance to nearest contour vertex
10 indicates the radius of neighbourhood parameters in number of cells
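The asterisks in the correlation tables flag two-tailed significance at the 0.05 and 0.01 levels. As a rough illustration of how such flags can be derived from a correlation coefficient, the sketch below uses the Fisher z-transformation with a normal approximation. This is an assumption: the appendix states neither the test used nor the number of check points, so the sample size `n` must be supplied by the reader and the function names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist


def two_tailed_p(r, n):
    """Approximate two-tailed p-value for a Pearson correlation r from n samples.

    Assumption: Fisher z-transformation with a normal approximation; the
    source does not state which test produced the asterisks.
    """
    if n <= 3 or abs(r) >= 1:
        raise ValueError("need n > 3 and |r| < 1")
    z = atanh(r) * sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def significance_flag(r, n):
    """Return '**', '*' or '' following the table's footnote convention."""
    p = two_tailed_p(r, n)
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""
```

For example, `significance_flag(-0.469, 100)` uses a hypothetical sample of 100 points; flags computed this way may differ slightly from the tabulated ones near the thresholds because of the approximation and the unknown sample size.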


Appendix 7: Mestersvig Correlations of Elevation Error with All 225 Terrain Parameters

Parameter   SpTen12      IDW512       IDW112
Z           -.469(**)    -.480(**)    -.445(**)
Z2          -.565(**)    -.579(**)    -.553(**)
Z3          -.572(**)    -.583(**)    -.561(**)
Z_AV5       -.470(**)    -.486(**)    -.447(**)
Z_AV52      -.567(**)    -.585(**)    -.555(**)
Z_AV53      -.574(**)    -.588(**)    -.564(**)
Z_AV10      -.472(**)    -.488(**)    -.447(**)
Z_AV102     -.569(**)    -.587(**)    -.555(**)
Z_AV103     -.575(**)    -.590(**)    -.564(**)
Z_AV20      -.474(**)    -.493(**)    -.447(**)
Z_AV202     -.573(**)    -.592(**)    -.557(**)
Z_AV203     -.581(**)    -.597(**)    -.568(**)
Z_SD5       -0.176       -0.119       -0.047
Z_SD52      -.269(**)    -.202(*)     -0.134
Z_SD53      -.314(**)    -.235(*)     -.209(*)
Z_SD10      -0.164       -0.009       -0.057
Z_SD102     -.261(**)    -0.041       -0.146
Z_SD103     -.310(**)    -0.074       -.219(*)
Z_SD20      -0.164       -0.158       -0.075
Z_SD202     -.238(*)     -.227(*)     -0.168
Z_SD203     -.274(**)    -.270(**)    -.231(*)
GD          -0.173       -0.067       -.206(*)
GD2         -.260(**)    -0.068       -.320(**)
GD3         -.303(**)    -0.091       -.359(**)
GD_AV5      -0.13        -0.055       -0.084
GD_AV52     -.216(*)     -0.052       -0.172
GD_AV53     -.259(**)    -0.042       -.236(*)
GD_AV10     -0.155       -0.158       -0.087
GD_AV102    -.242(*)     -.291(**)    -0.185
GD_AV103    -.292(**)    -.366(**)    -.255(**)
GD_AV20     -0.186       -0.18        -0.114
GD_AV202    -.259(**)    -.259(**)    -0.188
GD_AV203    -.298(**)    -.309(**)    -.237(*)
GD_SD5      -0.049       -0.166       -0.038
GD_SD52     -0.111       -.276(**)    -0.11
GD_SD53     -0.135       -.318(**)    -0.184
GD_SD10     -0.005       0.006        0.025
GD_SD102    -0.018       -0.027       0.007
GD_SD103    -0.016       -0.056       -0.009
GD_SD20     0.066        -0.088       0.04
GD_SD202    0.045        -0.13        0.038
GD_SD203    0.019        -0.165       0.033
GP          -.204(*)     -0.087       -.249(*)
GP2         -.294(**)    -0.128       -.339(**)


Parameter   SpTen12      IDW512       IDW112
GP3         -.325(**)    -0.183       -.335(**)
GP_AV5      -0.155       -0.096       -0.111
GP_AV52     -.239(*)     -0.122       -.214(*)
GP_AV53     -.269(**)    -0.128       -.278(**)
GP_AV10     -0.179       -0.192       -0.112
GP_AV102    -.264(**)    -.317(**)    -.220(*)
GP_AV103    -.302(**)    -.374(**)    -.290(**)
GP_AV20     -.204(*)     -.205(*)     -0.126
GP_AV202    -.273(**)    -.280(**)    -.202(*)
GP_AV203    -.300(**)    -.315(**)    -.243(*)
GP_SD5      -0.101       -.206(*)     -0.113
GP_SD52     -0.19        -.295(**)    -.234(*)
GP_SD53     -.236(*)     -.312(**)    -.316(**)
GP_SD10     -0.06        -0.069       -0.035
GP_SD102    -0.089       -0.136       -0.1
GP_SD103    -0.095       -0.179       -0.154
GP_SD20     -0.002       -0.141       -0.01
GP_SD202    -0.041       -.205(*)     -0.045
GP_SD203    -0.076       -.245(*)     -0.073
AX          .400(**)     .352(**)     .324(**)
AX2         -.217(*)     -0.071       -.307(**)
AX3         .458(**)     .460(**)     .416(**)
AX_AV5      .446(**)     .286(**)     .382(**)
AX_AV52     -.249(*)     -0.099       -.202(*)
AX_AV53     .603(**)     .447(**)     .525(**)
AX_AV10     .471(**)     .345(**)     .406(**)
AX_AV102    -.264(**)    -.285(**)    -.201(*)
AX_AV103    .613(**)     .521(**)     .559(**)
AX_AV20     .513(**)     .437(**)     .438(**)
AX_AV202    -0.125       -0.148       -0.125
AX_AV203    .629(**)     .576(**)     .603(**)
AX_SD5      -0.052       -0.178       -0.02
AX_SD52     -0.109       -.263(**)    -0.039
AX_SD53     -0.127       -.308(**)    -0.044
AX_SD10     -0.098       -0.128       -0.056
AX_SD102    -0.161       -0.157       -0.098
AX_SD103    -.199(*)     -0.179       -0.131
AX_SD20     -.292(**)    -.246(*)     -0.159
AX_SD202    -.370(**)    -.289(**)    -.219(*)
AX_SD203    -.398(**)    -.319(**)    -.256(**)
AY          -.211(*)     -.292(**)    -.300(**)
AY2         -0.13        0.004        -0.149
AY3         -.272(**)    -.238(*)     -.317(**)
AY_AV5      -.237(*)     -.216(*)     -.308(**)

Parameter   SpTen12      IDW512       IDW112
AY_AV52     -0.078       0.057        -0.007
AY_AV53     -.262(**)    -.198(*)     -.292(**)
AY_AV10     -.293(**)    -.311(**)    -.349(**)
AY_AV102    -0.068       -0.071       -0.025
AY_AV103    -.278(**)    -.360(**)    -.311(**)
AY_AV20     -.294(**)    -.334(**)    -.362(**)
AY_AV202    -0.057       -0.047       -0.035
AY_AV203    -.212(*)     -.262(**)    -.246(*)
AY_SD5      0.049        -0.08        0.006
AY_SD52     0.037        -0.167       -0.003
AY_SD53     0.033        -.212(*)     -0.022
AY_SD10     0.038        0.058        0.056
AY_SD102    0.003        0.054        0.058
AY_SD103    -0.021       0.039        0.06
AY_SD20     -0.058       -0.009       0.018
AY_SD202    -0.147       -0.031       -0.019
AY_SD203    -.204(*)     -0.056       -0.053
C           -0.024       .281(**)     0.176
C2          -0.142       -.265(**)    -.220(*)
C3          0.013        .266(**)     0.086
C_AV5       0.185        0.085        0.006
C_AV52      -0.023       -0.154       0.067
C_AV53      .219(*)      0.063        0.057
C_AV10      0.014        0.101        -0.002
C_AV102     -.207(*)     -0.183       -.263(**)
C_AV103     .200(*)      0.165        .196(*)
C_AV20      -.387(**)    -.384(**)    -.360(**)
C_AV202     -.355(**)    -.240(*)     -.268(**)
C_AV203     -.400(**)    -.319(**)    -.350(**)
C_SD5       -0.014       -.263(**)    -0.065
C_SD52      -0.026       -.276(**)    -0.134
C_SD53      -0.028       -.238(*)     -0.189
C_SD10      -0.028       -.214(*)     -0.084
C_SD102     -0.021       -.274(**)    -0.151
C_SD103     -0.011       -.275(**)    -.205(*)
C_SD20      0.006        -.257(**)    -0.055
C_SD202     0.013        -.303(**)    -0.105
C_SD203     0.021        -.291(**)    -0.144
CH          -0.018       0.165        .282(**)
CH2         -.253(**)    -0.146       -.383(**)
CH3         -0.161       0.176        .341(**)
CH_AV5      .199(*)      -0.075       -0.013
CH_AV52     0.045        0.034        -0.045
CH_AV53     0.132        0.062        0.043
CH_AV10     0.023        0.099        0.009
CH_AV102    -.212(*)     -.341(**)    -.286(**)
CH_AV103    .199(*)      .258(**)     0.113
CH_AV20     -.288(**)    -.238(*)     -.227(*)
CH_AV202    -.325(**)    -.309(**)    -.278(**)
CH_AV203    -.294(**)    -.285(**)    -.270(**)
CH_SD5      -0.035       -.219(*)     -0.06
CH_SD52     -0.043       -.215(*)     -0.123


Parameter   SpTen12      IDW512       IDW112
CH_SD53     -0.031       -0.191       -0.172
CH_SD10     -0.057       -.257(**)    -0.096
CH_SD102    -0.062       -.296(**)    -0.172
CH_SD103    -0.053       -.291(**)    -.230(*)
CH_SD20     -0.026       -.289(**)    -0.069
CH_SD202    -0.026       -.318(**)    -0.123
CH_SD203    -0.02        -.301(**)    -0.166
CV          0.02         -.246(*)     -0.043
CV2         -0.087       -.314(**)    -0.138
CV3         -0.038       -.211(*)     0.056
CV_AV5      -0.113       -0.126       -0.017
CV_AV52     -0.043       -0.11        0.094
CV_AV53     -0.148       -0.136       -0.034
CV_AV10     -0.001       -0.075       0.01
CV_AV102    -0.055       -0.023       -0.056
CV_AV103    -0.016       0.013        -0.054
CV_AV20     .359(**)     .323(**)     .386(**)
CV_AV202    -0.029       -0.135       0.075
CV_AV203    .288(**)     .262(**)     .301(**)
CV_SD5      -0.009       -.254(**)    -0.076
CV_SD52     -0.021       -.287(**)    -0.156
CV_SD53     -0.029       -.260(**)    -.219(*)
CV_SD10     -0.016       -.195(*)     -0.082
CV_SD102    -0.005       -.257(**)    -0.144
CV_SD103    0.004        -.261(**)    -.198(*)
CV_SD20     0.011        -.241(*)     -0.053
CV_SD202    0.015        -.295(**)    -0.102
CV_SD203    0.02         -.287(**)    -0.139
AVEX5       .212(*)      .365(**)     0.14
AVEX52      -0.12        -.269(**)    -.288(**)
AVEX53      .271(**)     .410(**)     .304(**)
AVEX10      .211(*)      .324(**)     0.071
AVEX102     -0.116       -0.144       -.234(*)
AVEX103     .289(**)     .354(**)     .273(**)
AVEX20      0.081        .221(*)      -0.043
AVEX202     -0.18        -.239(*)     -.256(**)
AVEX203     0.187        .334(**)     0.103
MAXEX5      .210(*)      .239(*)      0.132
MAXEX52     -.319(**)    -.293(**)    -.221(*)
MAXEX53     .354(**)     .304(**)     .278(**)
MAXEX10     .221(*)      .271(**)     0.101
MAXEX102    -.313(**)    -.338(**)    -.195(*)
MAXEX103    .348(**)     .342(**)     .250(*)
MAXEX20     0.14         0.16         0.028
MAXEX202    -.200(*)     -.221(*)     -0.091
MAXEX203    .245(*)      .268(**)     0.145
MINEX5      -0.096       0.105        -0.061
MINEX52     -0.158       0.107        -0.157
MINEX53     -0.182       0.102        -.217(*)
MINEX10     -0.139       -0.029       -0.093
MINEX102    -.223(*)     -0.137       -.199(*)
MINEX103    -.257(**)    -.210(*)     -.259(**)

Parameter   SpTen12      IDW512       IDW112
MINEX20     -.205(*)     -0.158       -0.177
MINEX202    -.272(**)    -.251(*)     -.265(**)
MINEX203    -.284(**)    -.247(*)     -.295(**)
REL5        -0.165       -0.096       -0.105
REL52       -.259(**)    -0.126       -.203(*)
REL53       -.300(**)    -0.128       -.265(**)
REL10       -.197(*)     -.200(*)     -0.109
REL102      -.281(**)    -.309(**)    -.218(*)
REL103      -.318(**)    -.355(**)    -.287(**)
REL20       -0.19        -0.182       -0.114
REL202      -.255(**)    -.250(*)     -0.188
REL203      -.282(**)    -.276(**)    -.222(*)
TEXT5       -0.032       -0.071       -0.007
TEXT52      -0.098       -0.143       -0.056
TEXT53      -0.146       -.197(*)     -0.102
TEXT10      -0.048       -0.002       0.01
TEXT102     -0.082       -0.032       -0.016
TEXT103     -0.106       -0.061       -0.05
TEXT20      -0.02        -0.003       0.006
TEXT202     -0.048       -0.017       -0.018
TEXT203     -0.072       -0.033       -0.043
TEXTP5      -0.081       -0.153       -0.071
TEXTP52     -0.167       -.247(*)     -0.168
TEXTP53     -.226(*)     -.289(**)    -.227(*)
TEXTP10     -0.086       -0.059       -0.047
TEXTP102    -0.125       -0.121       -0.115
TEXTP103    -0.143       -0.165       -0.176
TEXTP20     -0.069       -0.047       -0.044
TEXTP202    -0.108       -0.09        -0.095
TEXTP203    -0.131       -0.13        -0.132
PTDIST      0.011        0.02         0.012
PTDIST2     0.006        0.011        -0.004
PTDIST3     0.008        0.013        -0.006

*  Correlation is significant at the 0.05 level (2-tailed).
** Correlation is significant at the 0.01 level (2-tailed).

Z = elevation
GD = gradient measured in degrees
GP = gradient measured in percent
AX = aspect vector X
AY = aspect vector Y
C = overall curvature
CH = plan (horizontal) curvature
CV = profile (vertical) curvature
AVEX = average extremity
MAXEX = maximum extremity
MINEX = minimum extremity
REL = relative relief
TEXT = texture measured in degrees
TEXTP = texture measured in percent
PTDIST = distance to nearest contour vertex
AV = neighbourhood average
SD = neighbourhood standard deviation
5, 10 and 20 indicate the radius of neighbourhood parameters in number of cells
A trailing 2 or 3 indicates the square or cube of a parameter respectively

Appendix 8: DEM Quality Report

Filename: c:\bruce\phd\dems\spten12
Date created: 10/10/01
Created by: B.Carlisle

DESCRIPTION
DEM of the Mestersvig study area

GEOREFERENCING
Coord. System: User defined
Datum: WGS84
Min X: 7750     Max X: 31250    Columns: 2350
Min Y: 1810     Max Y: 23450    Rows: 5350
Resolution: 10m

SOURCE DATA
Contour lines digitised from 1:15,000 Mylar contour maps
≥150m: 50m contour interval
120m contour also digitised
Digitised by B. Carlisle, 1998.
Mylar maps produced by GEUS, Denmark, 1958 from stereo aerial photos.
Digitised contour lines generalised using Douglas-Peucker algorithm with a 20m tolerance band

INTERPOLATION
Software: ArcView 3.2
Technique: Spline with tension
Parameters: Weight: 0.1
            Search radius: 12 points

QUALITY DESCRIPTION
All major landforms represented. A highly (unrealistically?) smoothed surface. Visible pits.

GEOMORPHOLOGICAL INDICES
Contour bias: 24.45%
Flatness index: 14.33%
Pit volume: 245,655 m³

ACCURACY MEASURES
Mean error: 4.90m
Std. Dev. of error: 32.41m
Reliability: 4

ACCURACY SURFACE
Filename: c:\bruce\phd\accsurfs\sd20_spten12
Date created: 25/10/01
Created by: B.Carlisle

Error surface equation:
VAR.        COEFF.         VAR.        COEFF.         VAR.        COEFF.
Constant    -1.490         Ch_AV52     114.744        Z_AV203     0.0000005043
Ax_AV203    1642.690       MinEx203    -0.000137      TextP52     0.002861
Z_AV202     -0.000749      Rel20       0.312          Cv_AV10     41.143
Ax_SD103    -2700.555      MinEx202    0.01346        Z_AV20      0.156
MinEx103    0.0004046      Ay_SD53     -3617.705      TextP20     -0.275
MaxEx10     0.813          AvEx53      0.156          AvEx52      -0.558
Ay_SD20     115.529        Cv          3.914          Ch_SD203    0.611

Mean error filter window: 0m
SD filter window: 20m
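The error surface equation in the quality report is given as a constant and a list of variable and coefficient pairs. Assuming it is applied as a linear combination (the constant plus each coefficient multiplied by the matching terrain-parameter value at a cell), the evaluation could be sketched as follows. The function name and the cell values are hypothetical, and only three of the report's terms are used to keep the example short.

```python
def evaluate_error_surface(constant, coefficients, cell_values):
    """Evaluate constant + sum(coefficient * terrain-parameter value) for one cell.

    Assumption: the reported equation is a linear combination of the listed
    variables; the function name and inputs are illustrative only.
    """
    missing = set(coefficients) - set(cell_values)
    if missing:
        raise KeyError(f"no cell value for: {sorted(missing)}")
    return constant + sum(
        coefficient * cell_values[name]
        for name, coefficient in coefficients.items()
    )


# Three terms taken from the report; the remaining terms are omitted here.
constant = -1.490
coefficients = {"MaxEx10": 0.813, "Ay_SD20": 115.529}

# Hypothetical terrain-parameter values for a single cell.
cell_values = {"MaxEx10": 10.0, "Ay_SD20": 0.02}

estimate = evaluate_error_surface(constant, coefficients, cell_values)
```

Applying the same function to every cell of the terrain-parameter grids would give a surface of estimated values; any filtering such as the 20m SD filter window listed in the report is not covered by this sketch.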

Appendix 9: CD ROM

The enclosed CD-ROM contains:
    a digital version of this thesis (in Microsoft Word format)
    the DEM Uncertainty and Quality ArcView extension – DEMUncQual.avx
    copies of all the ArcView scripts accessed by DEMUncQual.avx (in Microsoft Word format)

A “BruceCarlislePhD” folder contains 3 sub-folders: Thesis, Appendices and Scripts. The contents of these three folders are:

Thesis
    BruceCarlisleThesis.doc - Word master document linking the following sub-documents:
        Abstract.doc
        Acknowledgements.doc
        Contents.doc
        List of Figures.doc
        List of Tables.doc
        Chapter 1- Introduction.doc
        Chapter 2 – Background.doc
        Chapter 3 – Assessing DEM Quality.doc
        Chapter 4 – Modelling the Spatial Distribution of Error.doc
        Chapter 5 – Applying Knowledge of DEM Accuracy to Uncertainty Modelling.doc
        Chapter 6 – Summary and Further Work.doc
        References.doc

Appendices
    Appendices.doc – Word master document linking the following sub-documents:
        Appendix 1 – Kriging Trial.doc
        Appendix 2 – DEMUncQual.doc
        Appendix 3 – Mestersvig Map.doc
        Appendix 4 – Snowdon 12 Correlations.doc
        Appendix 5 – Snowdon 225 Correlations.doc
        Appendix 6 – Mestersvig 12 Correlations.doc
        Appendix 7 – Mestersvig 225 Correlations.doc
        Appendix 8 – Quality Report.doc
        Appendix 9 – CDROM.doc

Scripts
    DEMUncQual.avx – the ArcView extension
    DEMUncQual_GeoIndices.doc
    DEMUncQual_AccMeasures.doc
    DEMUncQual_TPs.doc
    DEMUncQual_Esurf.doc
    DEMUncQual_NTestOne.doc
    DEMUncQual_NTestMany.doc
    DEMUncQual_MakeRandomGrids.doc
    DEMUncQual_MonteCarlo.doc
    DEMUncQual_CalcSD.doc
    DEMUncQual_DEMFill.doc
    DEMUncQual_SimAsp.doc
    DEMUncQual_SimErr.doc
    DEMUncQual_SimFld.doc
    DEMUncQual_SimLN.doc
    DEMUncQual_SimSlp.doc
    DEMUncQual_SimUPS.doc
    DEMUncQual_SimWS.doc
    DEMUncQual_tCritical.doc

Appendix 1: Kriging Trial
