int. j. geographical information science, 2002 vol. 16, no. 8, 819–842
Research Article
Error assessment of grid-based flow routing algorithms used in hydrological models
QIMING ZHOU* and XUEJUN LIU*§†
*Department of Geography, Hong Kong Baptist University, Kowloon Tong, Kowloon, Hong Kong; e-mail:
[email protected]
§Department of Highway and Bridge, Changsha Communications Institute, Changsha, Hunan 410076, China
†National Laboratory of Information Engineering Surveying, Wuhan University, Wuhan, Hubei 430079, China
(Received 6 March 2001; accepted 10 December 2001)
Abstract. This paper reports an investigation into the accuracy of grid-based routing algorithms used in hydrological models. A quantitative methodology has been developed for objective and data-independent assessment of the errors generated by algorithms that extract hydrological parameters from gridded DEM. The generic approach is to use artificial surfaces that can be described by a mathematical model, so that the ‘true’ output value can be pre-determined to avoid uncertainty caused by uncontrollable data errors. Four mathematical surfaces based on an ellipsoid (representing convex slopes), an inverse ellipsoid (representing concave slopes), a saddle and a plane were generated, and the theoretical ‘true’ value of the Specific Catchment Area (SCA) at any given point on the surfaces was computed using mathematical inference. Based on these models, tests were made on a number of algorithms for SCA computation. The actual output values from these algorithms on the convex, concave, saddle and plane surfaces were compared with the theoretical ‘true’ values, and the errors were then analysed statistically. The strengths and weaknesses of the selected algorithms are also discussed.
International Journal of Geographical Information Science ISSN 1365-8816 print/ISSN 1362-3087 online © 2002 Taylor & Francis Ltd http://www.tandf.co.uk/journals DOI: 10.1080/13658810210149425
1. Introduction
One important field of study in implementing Digital Terrain Analysis (DTA) in a Geographical Information System (GIS) concerns digital hydrological models and algorithms, which describe the transportation and flow of mass (e.g. water, sediments and nutrients) between land units. These models and algorithms have been developed to compute, for example, Total Contributing Area (TCA) (Freeman 1991, Moore et al. 1994), Specific Catchment Area (SCA) (Costa-Cabral and Burges 1994, Gallant and Wilson 1996), Topographic Index (also known as the ln(a/tan β) index, Quinn et al. 1991, 1995), and Stream Power Index (SPI) (Moore et al. 1994, Wilson and Gallant 2000). The results from the models form an important input to, for example, the development of soil erosion models (Dietrich et al. 1993), landuse and land evaluation (Desmet and Govers 1996, Stephen and Irvin 2000), landslide prediction (Duan and
Grant 2000), and catchment and drainage network analysis (Quinn et al. 1991, Tribe 1992).
The most critical spatial data required for digital hydrological modelling is surface elevation, which is modelled in GIS as discrete samples of the specific land surface. Although numerous models have been developed for this purpose, such as the Triangulated Irregular Network (TIN), the TOPOG model (O’Loughlin 1986) and digital contours, the grid Digital Elevation Model (DEM) has been the most commonly used data source for DTA because of its simple structure, ease of computation and compatibility with remotely sensed and digital photogrammetry data (Gao 1998, Tang 2000). Numerous grid DEM-based hydrological modelling algorithms have been developed (O’Callaghan and Mark 1984, Freeman 1991, Quinn et al. 1991, Fairfield and Leymarie 1991, Costa-Cabral and Burges 1994, Meisels et al. 1995, Tarboton 1997, Mackay and Band 1998, Liang and Mackay 2000).
However, a grid DEM is itself only an approximation of the continuous real-world surface by regularly spaced samples, and is restricted by, for example, scale and measurement errors. Terrain analysis models implemented on it, such as slope, aspect, curvature and flow routing algorithms, are therefore also approximations of reality. In practice, numerous assumptions and optimisations about mass transportation and movement on a specific local surface (often represented as a 3×3 window) have to be made in order to establish a workable mathematical model (Holmgren 1994), resulting in significantly different approaches and methodologies towards terrain analysis modelling (Pilesjö and Zhou 1996). It has been recognised that such modelling algorithms, as employed in much of today’s GIS software, may produce poor results that lead to significant artificial features (known as ‘artifacts’) in the output.
For topographical parameters derived from a grid DEM, there appear to be three major sources of error, namely:
1. Data accuracy of the DEM itself, including sample errors, interpolation errors and representation errors (Bolstad and Lillesand 1994, Florinsky 1998a, Walker and Willgoose 1999, Tang 2000).
2. Spatial data structure of the DEM, such as data precision, grid resolution and orientation (Zhang and Montgomery 1994, Gao 1998).
3. Mathematical models and algorithms employed (Holmgren 1994, Wolock and McCabe 1995, Desmet and Govers 1996).
The focus of this study is on the third, i.e. the error generated by the mathematical models and algorithms.
To further investigate the accuracy and correctness of hydrological modelling algorithms, it is fundamental to establish a methodology for quantitative and objective measurement of the errors generated by different methods and algorithms (Zhou 2000). In the past, the most commonly used method was to apply the proposed algorithms to a ‘real-world’ DEM. The results from the simulation were then visually or statistically examined against ‘common knowledge’ of what should be expected, or against digital data derived from cartographic resources (maps or aerial photographs) (Band 1986, Quinn et al. 1991, Chairat and Delleur 1993, Garbrecht and Martz 1994, Worlock and Price 1994, Zhang and Montgomery 1994, Wolock and McCabe 1995, Desmet and Govers 1996, Mendicino and Soil 1997, Tang 2000, Liang and Mackay 2000). This approach, however, is challenged by the fact that no real-world
DEM is perfect, so that the errors inherent in the DEM largely remain unknown. The conclusion of the assessment can therefore always be questioned because of the uncertainty of the data. From the point of view of error assessment, there are at least three shortfalls in this evaluation method:
1. The evaluation results are related to the actual topography. The results derived from one particular landform may be unsuitable for others.
2. The accuracy of a DEM is often difficult to quantify. Thus, the uncertainty in the DEM itself often masks the inherent errors of the models and algorithms, resulting in improper, sometimes controversial, assessment and analytical results (Florinsky 1998a, 1998b).
3. The comparison between different models and algorithms on a real-world DEM can only present their relative accuracy and error distribution (which should be termed ‘difference distribution’). Since the ‘true’ value is unknown, any judgement of the correctness and quality of the models is uncertain and subjective.
Freeman (1991) presented an alternative method that used an artificial surface, a cone, to assess divergent flow simulation algorithms. The results were assessed by checking whether the pattern of catchment area follows a theoretical expectation. Although it is arguable whether a cone surface is realistic for real-world cases, this method eliminated the uncertainty of the data and thus produced more convincing test results. However, an authoritative methodology for assessing hydrological models requires that a set of ‘standard’ tests be established so that different algorithms and models can be quantitatively compared.
The aim of this study, therefore, is to develop a quantitative, data-independent method for analysing and objectively assessing errors generated by hydrological modelling algorithms.
The generic approach is to use artificial surfaces which can be described by a mathematical model, so that the ‘true’ output value can be predetermined to avoid uncertainty caused by uncontrollable data errors (Jones 1998). Four mathematical surfaces based on an ellipsoid (representing convex slopes), an inverse ellipsoid (representing concave slopes), a saddle and a plane have been generated, and the theoretical ‘true’ value of the SCA at any given point on the surfaces has been computed using mathematical inference. Based on these models, we tested and compared a number of hydrological flow routing algorithms (Desmet and Govers 1996) for SCA computation using artificial mathematical surfaces. The actual output values from these flow routing algorithms on the convex, concave, saddle and plane surfaces were compared with the theoretical results. The errors generated by these models were then statistically analysed, and their strengths and weaknesses, as well as application suitability, are also discussed.
2. The scope and approach
The scope of this study is to investigate the methodology for assessing errors in hydrological modelling algorithms. For this purpose, we have restricted our focus to the following:
• Data: grid DEM,
• Algorithms: the flow routing algorithms which are employed for computing topographic parameters for simulating surface flow and other geomorphologic processes, and
• Applications: computation of Total Contributing Area (TCA) and, particularly, Specific Catchment Area (SCA).
Given the shortfalls and problems discussed above for error assessment using ‘real-world’ DEM, we propose a data-independent approach that employs a series of artificial DEMs defined by analytical polynomials, which form the foundation for computing topographic parameters such as slope, aspect, catchment area and slope curvature. By resampling the continuous polynomial surface at a given spatial resolution (i.e. cell size), a DEM can be generated on which the ‘true’ values of selected topographic parameters can be computed through inference. Based on this error-free DEM, the output from a tested algorithm can be compared with the ‘true’ values, so that the absolute errors and their distributions can be quantified with confidence.
3. The flow routing algorithms in hydrological modelling
For computing the catchment area on a grid DEM, a common approach is to undertake a two-step process, namely, (a) determination of the flow direction from the current cell, and (b) computation of the flow distribution from the current cell (the source) to the lower neighbouring cells (the receivers). For this, there have been two generic approaches (figure 1):
1. Single Flow Direction (SFD) algorithms, where the total amount of flow is received by the single neighbouring cell which has the maximum downhill slope from the current cell (Mark 1984, Fairfield and Leymarie 1991).
2. Multiple Flow Direction (MFD) algorithms, where the flow from the current cell is distributed to all lower neighbouring cells according to a predetermined rule (Quinn et al. 1991, Freeman 1991, Tribe 1992, Holmgren 1994, Tarboton 1997, Pilesjö et al. 1998).
Figure 2 shows a simple classification of some grid-based flow routing algorithms found in the literature. Given the scope of this paper, the characteristics of some flow routing algorithms are only briefly described below. More details about most of these algorithms can be found in Wilson and Gallant (2000).
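Step (a) of the two-step process for the SFD case can be sketched as follows. This is only a minimal illustration: the names `NEIGHBOURS`, `downhill_slopes` and `sfd_direction` are hypothetical, and the slope is assumed to be the elevation drop divided by the centre-to-centre distance (one cell size for edge neighbours, √2 cell sizes for corner neighbours).

```python
import math

# (row, column) offsets of the eight neighbours in a 3x3 window.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def downhill_slopes(window, cell_size=1.0):
    """Return {offset: slope} for every neighbour lower than the centre cell."""
    centre = window[1][1]
    slopes = {}
    for dr, dc in NEIGHBOURS:
        # Assumed distance: cell_size for edge neighbours, sqrt(2)*cell_size for corners.
        distance = cell_size * math.hypot(dr, dc)
        slope = (centre - window[1 + dr][1 + dc]) / distance
        if slope > 0:
            slopes[(dr, dc)] = slope
    return slopes


def sfd_direction(window, cell_size=1.0):
    """Offset of the steepest downhill neighbour, or None for a pit or flat window."""
    slopes = downhill_slopes(window, cell_size)
    if not slopes:
        return None
    return max(slopes, key=slopes.get)
```

With the window `[[9, 8, 7], [8, 5, 3], [7, 4, 1]]` the steepest descent is towards the lower-right corner, so all flow from the centre cell would be passed to that single neighbour.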
Figure 1. Two common approaches towards determination of flow direction: (a) single flow direction (SFD) algorithm, and (b) multiple flow direction (MFD) algorithm.
Figure 2. The classification of grid-based flow routing algorithms.
The deterministic eight-node (D8) algorithm was reported by O’Callaghan and Mark
in 1984. On a 3×3 local window of a DEM, the slopes between the centre cell and its eight neighbouring cells are computed. The flow direction of the centre cell points to the centre of the neighbouring cell with the maximum downhill slope, and that single neighbour receives all flow accumulated in the centre cell (figure 1(a)). The algorithm is the most popular one, particularly in commercial GIS software, because of its simple and efficient computation, and its strong capability in dealing with local depressions and flat areas (Tarboton 1997). One major weakness of the D8 algorithm is that although the centre cell can receive upstream flow from several sources, the downstream flow can only be in one direction. Thus it is not suitable for areas where divergent flow occurs, such as convex slopes and ridges (Costa-Cabral and Burges 1994, Moore 1996, Wilson and Gallant 2000).
The random eight-node (Rho8) algorithm (Fairfield and Leymarie 1991) recognises that the flow over a grid DEM can be arbitrary and thus introduces randomness into D8. It is a stochastic version of D8 that aims to break up the parallel flow paths that may result from D8, and produces a mean flow direction equal to the aspect. However, it still cannot model flow dispersion (Moore 1996, Wilson and Gallant 2000).
Multiple flow direction (MFD) algorithms recognise flow divergence over a natural landscape. Thus, on a 3×3 local window, the flow from the centre cell does not necessarily point to a single neighbouring cell, but rather may flow into all or some of the downstream neighbours (figure 1(b)). Based on this principle, a number of algorithms have been developed with various ways of distributing flow proportionally. Quinn et al. (1991) and Freeman (1991) reported different implementations of the MFD algorithms, a common point of which is that they do not need to determine the flow direction for individual cells. Quinn et al.
(1991) proposed a flow distribution equation according to slope and contour length (flow width):
$$ F_i = \frac{L_i \tan\beta_i}{\sum_{j=1}^{n} L_j \tan\beta_j} \qquad (\tan\beta > 0;\ n \le 8) \tag{1} $$
where $F_i$ is the proportion of flow to be distributed into the $i$th neighbouring cell; $L_i$ is the flow width; and $\tan\beta_i$ denotes the slope between the centre cell and the $i$th neighbouring cell. On a grid DEM, $L_i = (\sqrt{2}/4)\,\Delta d$ ($\Delta d$ = cell size) for the diagonal directions (i.e. corner-adjacent cells), while $L_i = (1/2)\,\Delta d$ for the others (i.e. edge-adjacent cells). Freeman (1991) took a similar approach but did not consider the flow width (Freeman MFD, or FMFD):
$$ F_i = \frac{(\tan\beta_i)^r}{\sum_{j=1}^{n} (\tan\beta_j)^r} \qquad (\tan\beta > 0;\ n \le 8) \tag{2} $$
Based on a test on a conical surface, Freeman found that r = 1.1 produced the most accurate results. This model was later tested by Pilesjö and Zhou (1997) on a spherical surface, and it was found that r = 1.0 would be more appropriate for that case.
Costa-Cabral and Burges (1994) proposed the Digital Elevation Model Networks (DEMON) algorithm, which is conceptually similar to the stream-tube approach used with contour-based DEM by Moore and Grayson (1991), but implemented on a grid DEM. In DEMON, flow is generated at each cell (source cell) and routed down a stream tube until the edge of the DEM or a local depression is reached. The stream tubes are constructed from the points where a line drawn in the aspect direction intersects a cell edge. The amount of flow, expressed as a fraction of the area of the source cell, is added to the flow accumulation value of the downstream cell, so that the source cell contributes an impact value to each of the cells along the stream tube. After flow has been generated on all cells and its impact on each of its ‘stream tube’ cells has been added, the final flow accumulation value is the total upslope area contributing runoff to each cell, i.e. SCA. The DEMON algorithm was subsequently modified and implemented in the TAPES-G software by Moore and his co-workers (Wilson and Gallant 2000).
Tarboton (1997) argued that sparse catchment areas should be avoided while computing SCA.
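The flow proportions of equations (1) and (2) can be sketched for a single 3×3 window as follows. The function names are hypothetical, slopes are assumed to be the elevation drop over the centre-to-centre distance, and only downhill neighbours (tan β > 0) take part in the sums.

```python
import math

NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def downhill_tangents(window, cell_size=1.0):
    """Return {offset: tan(beta)} for neighbours lower than the centre cell."""
    centre = window[1][1]
    tangents = {}
    for dr, dc in NEIGHBOURS:
        distance = cell_size * math.hypot(dr, dc)
        tan_beta = (centre - window[1 + dr][1 + dc]) / distance
        if tan_beta > 0:
            tangents[(dr, dc)] = tan_beta
    return tangents


def normalise(weights):
    """Scale weights so they sum to 1; empty dict if there is no downhill cell."""
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()} if total > 0 else {}


def quinn_fractions(window, cell_size=1.0):
    """Equation (1): weights are flow width times slope."""
    weights = {}
    for (dr, dc), tan_beta in downhill_tangents(window, cell_size).items():
        diagonal = dr != 0 and dc != 0
        width = (math.sqrt(2) / 4 if diagonal else 0.5) * cell_size
        weights[(dr, dc)] = width * tan_beta
    return normalise(weights)


def freeman_fractions(window, cell_size=1.0, r=1.1):
    """Equation (2): weights are slope raised to the power r, no flow width."""
    tangents = downhill_tangents(window, cell_size)
    return normalise({k: t ** r for k, t in tangents.items()})
```

For a window whose bottom row is one unit lower than the rest, `quinn_fractions` assigns half of the flow to the edge neighbour and a quarter to each corner neighbour, whereas `freeman_fractions` with r = 1.0 assigns slightly more to the corners because it ignores the shorter diagonal flow width.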
By integrating DEMON (Costa-Cabral and Burges 1994) and the aspect-driven algorithm proposed by Lea (1992), Tarboton (1997) therefore proposed the Deterministic infinite-node (Dinf, or D∞) algorithm. In Dinf, on a 3×3 local window, the centroids of the centre cell and its neighbours are used to form 8 triangles. Dinf then computes the slope for each of the triangles and selects the one with the maximum downhill slope as the flow direction. The two neighbouring cells that the triangle intersects receive the flow in proportion to their closeness to the aspect of the triangle.
Pilesjö et al. (1998) proposed an alternative method based on the form of the surface defined by the DEM. On a 3×3 local window, the centroids of the nine cells are used to establish the local second-order trend surface, so that the form of the window can be determined in terms of convexity and concavity. The flow distribution can then be determined according to whether the local flow is divergent or convergent. An alternative ‘feature-based’ approach was also proposed by Mackay and Band (1998) and Liang and Mackay (2000), which explicitly recognises hydrological features (e.g. lakes) in watershed extraction processes.
4. Computation of theoretical ‘true’ values
The Total Contributing Area (TCA) is defined as ‘the plane area of terrain that contributes flow to the contour segment’ (Moore and Grayson 1991). The Specific
Catchment Area (SCA) is defined as ‘upslope area per unit width of contour’, which is significant for studying runoff volume, steady-state runoff rate, soil characteristics, soil-water content and geomorphology (Wilson and Gallant 2000). As the Contour Length (CL) tends to zero, the SCA is defined as the upstream length at that point (Hutchinson and Dowling 1991):
$$ SCA = \lim_{CL \to 0} \frac{TCA}{CL} \tag{3} $$
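As a quick check of equation (3), consider a radially symmetric dome, an assumption made here only for illustration, on which the flow lines are straight radii leaving the summit. For a contour arc at radius $r$ that subtends an angle $\theta$ at the summit, both TCA and CL are elementary, and the limit can be taken directly:

```latex
\begin{aligned}
TCA &= \tfrac{1}{2} r^{2}\theta, \qquad CL = r\theta, \\
SCA &= \lim_{\theta \to 0} \frac{\tfrac{1}{2} r^{2}\theta}{r\theta} = \frac{r}{2}.
\end{aligned}
```

The SCA is therefore zero at the summit and grows linearly with the distance from it.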
On an actual land surface, the TCA and SCA are determined by flow direction and flow line. With the exception of divergent flow cases, Flow Direction (FD) is defined as the direction of maximum hydraulic slope, i.e. the maximum downhill slope or its approximation. Flow Path (FP) is the trace line of the flow along the FD. Both FD and FP are controlled by the natural morphology of the land surface. The uphill area bounded by the FPs at two given points on a contour line forms the TCA for the contour segment, while the TCA and CL determine the SCA. According to the definition, the TCA is the value assigned to the arc of contour. As CL tends to zero, SCA can be recognised as the TCA at that point. Thus, on a known mathematical surface, the polynomial equation can be solved at any given point on the surface to give the ‘true’ value of TCA at that point. Therefore, we selected SCA as the criterion for this study to assess the accuracy of hydrological modelling algorithms.
4.1. Flow direction and flow path
Given a 3-dimensional surface $z = f(x, y)$, its representation on a plane can be expressed as the contour $f(x, y) = c$, where $c$ is an arbitrary constant. For any given point $P(x, y)$, $f(x, y)$ decreases most rapidly in the direction opposite to the gradient at $P$. The magnitude of this rate of decrease is called the maximum slope, and its direction is defined as the aspect or Flow Direction (FD). For $f(x, y)$, the gradient at $P$ is expressed as:
$$ \operatorname{grad} f(x, y) = f_x \mathbf{i} + f_y \mathbf{j} \tag{4} $$
where $\mathbf{i}$ and $\mathbf{j}$ denote unit directions, and the norm of (4) is the slope:
$$ \text{slope} = \sqrt{f_x^2 + f_y^2} \tag{5} $$
When $f_y \ne 0$, the direction of the gradient is the aspect:
$$ \text{aspect} = \arctan\left(\frac{f_x}{f_y}\right) \tag{6} $$
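Equations (5) and (6) can be evaluated for any differentiable surface as in the sketch below. The helper names are hypothetical, and the partial derivatives are approximated by central differences purely for illustration; on an analytical surface they would normally be written out exactly. Note that `math.atan` returns a value in (−π/2, π/2), so, like equation (6) as printed, the sketch does not resolve the quadrant of the aspect.

```python
import math


def gradient(f, x, y, h=1e-6):
    """Central-difference estimate of (f_x, f_y) at (x, y)."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy


def slope_and_aspect(fx, fy):
    """Equation (5) for the slope and equation (6) for the aspect (radians).

    The aspect is returned as None when f_y == 0, where equation (6) is undefined.
    """
    slope = math.hypot(fx, fy)
    aspect = math.atan(fx / fy) if fy != 0 else None
    return slope, aspect


def plane(x, y):
    """Illustrative test surface z = 3x + 4y (not one of the paper's surfaces)."""
    return 3.0 * x + 4.0 * y
```

For the illustrative plane, `slope_and_aspect(*gradient(plane, 10.0, 20.0))` gives a slope close to 5 and an aspect close to arctan(3/4), independent of the point chosen.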
At a given point $P$, the aspect is perpendicular to the contour line passing through $P$. As a point moves downhill under gravity, its trace on the surface is the FP. At any point on the FP, the tangent direction is perpendicular to the contour line passing through that point, i.e. the FP is perpendicular to the contour lines at every point. Thus, the FP $g(x, y)$ passing through point $P(x, y)$ satisfies the equation below:
$$ f'(x, y)\, g'(x, y) = -1 \tag{7} $$
Solving the differential equation, we have
$$ g(x, y) = \int \frac{-1}{f'(x, y)}\, dx \tag{8} $$
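As an illustration of equations (7) and (8), consider elliptical contours $x^2/a^2 + y^2/b^2 = \text{const}$, taken here as an assumed example because they are the contours of the ellipsoid defined in table 1. Reading $f'$ as the slope $dy/dx$ of the contour line and $g'$ as the slope of the flow path, the steps are:

```latex
\begin{aligned}
f'(x, y) &= -\frac{b^{2}x}{a^{2}y}
  &&\text{(implicit differentiation of the contour)} \\
g'(x, y) &= \frac{-1}{f'(x, y)} = \frac{a^{2}y}{b^{2}x}
  &&\text{(equation (7))} \\
\int \frac{dy}{y} &= \frac{a^{2}}{b^{2}} \int \frac{dx}{x}
  \;\Longrightarrow\; y = cont \times x^{m}, \quad m = \frac{a^{2}}{b^{2}}
  &&\text{(equation (8))}
\end{aligned}
```

The result agrees with the flow path listed for the ellipsoid in table 1, with $cont$ the integral constant fixed by the point through which the flow path passes.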
4.2. Total catchment area and specific catchment area
As illustrated in figure 3, $f_2$ is the contour line passing through point $a(x_1, y_1)$, while $f_1$ is a contour line higher than point $a$. Assuming $f_1$ and $f_2$ are continuous, there is an FP $g_1$ that passes through $a$, and $g_1$ and $f_1$ intersect at point $d(x_4, y_4)$. To compute the SCA at point $a$, an arbitrary point $b(x_2, y_2)$ on $f_2$ is selected, at which we can similarly derive the FP $g_2$ that passes through $b$ and the intersection point $c(x_3, y_3)$ of $g_2$ and $f_1$. Thus, the TCA above the CL between $a$ and $b$ is the area bounded by arcs $ab$, $bc$, $cd$ and $da$, which can be expressed as:
$$ TCA = \int_{x_1}^{x_4} g_1\, dx + \int_{x_4}^{x_3} f_1\, dx - \int_{x_1}^{x_2} f_2\, dx - \int_{x_2}^{x_3} g_2\, dx \tag{9} $$
The CL between $a$ and $b$ can be derived as:
$$ CL = \int_{x_1}^{x_2} \sqrt{1 + (f'(x))^2}\, dx \tag{10} $$
According to the SCA definition expressed by equation (3), as $CL \to 0$, i.e. $x_2 \to x_1$, we have:
$$ SCA = \lim_{CL \to 0} \frac{TCA}{CL} = \lim_{x_2 \to x_1} \frac{\int_{x_1}^{x_4} g_1\, dx + \int_{x_4}^{x_3} f_1\, dx - \int_{x_1}^{x_2} f_2\, dx - \int_{x_2}^{x_3} g_2\, dx}{\int_{x_1}^{x_2} \sqrt{1 + (f'(x))^2}\, dx} \tag{11} $$
Figure 3. The computation of specific catchment area (SCA).
As point $b$ moves close to point $a$ along contour $f_2$, $CL \to 0$ and $TCA \to 0$. Thus:
$$ SCA = \lim_{x_2 \to x_1} \frac{TCA}{CL} = \lim_{x_2 \to x_1} \frac{\dfrac{d(TCA)}{dx_2}}{\dfrac{d(CL)}{dx_2}} \tag{12} $$
Differentiating both the numerator and the denominator in equation (12) with respect to $x_2$, we have:
$$ \frac{d}{dx_2}(CL) = \frac{d}{dx_2} \int_{x_1}^{x_2} \sqrt{1 + (f'(x))^2}\, dx = \sqrt{1 + (f'(x_2))^2} \tag{13} $$
$$ \frac{d}{dx_2} \int_{x_1}^{x_4} g_1\, dx = 0 \tag{14} $$
$$ \frac{d}{dx_2} \int_{x_4}^{x_3} f_1\, dx = f_1(x_3) \frac{dx_3}{dx_2} \tag{15} $$
$$ \frac{d}{dx_2} \int_{x_1}^{x_2} f_2\, dx = f_2(x_2) \tag{16} $$
$$ \frac{d}{dx_2} \int_{x_2}^{x_3} g_2\, dx = \int_{x_2}^{x_3} g_2'(x_2)\, dx + g_2(x_3) \frac{dx_3}{dx_2} - g_2(x_2) \tag{17} $$
Note that $f_1(x_3) = g_2(x_3)$ and $f_2(x_2) = g_2(x_2)$, and that $x_3 \to x_4$ as $x_2 \to x_1$. Combining equations (13) to (17), we derive equation (18) from equation (11) for computing SCA at any given point on a given surface:
$$ SCA = \lim_{x_2 \to x_1} \frac{-\int_{x_2}^{x_3} g_2'(x_2)\, dx}{\sqrt{1 + (f'(x_2))^2}} \tag{18} $$
5. Design of mathematical surfaces
Considering the complexity of surfaces in the real world, the selected artificial mathematical surfaces need to be similar to reality, or at least to part of a real surface, since the simplest surfaces may not yield meaningful results (Holmgren 1994, Jones 1998). In this study, we selected the following polynomial surfaces (figure 4):
Figure 4. Typical mathematical surfaces represented as 3-dimensional block diagrams (left) and contour lines: (a) ellipsoid, (b) inverse ellipsoid, (c) saddle and (d) plane. The unit represented is metres.
5.1. Ellipsoid
The ellipsoid (elliptic dome, figure 4(a)) simulates part of a ridge or convex slope that represents a water divide. Its long axis represents the water divide, along which divergent flow is found. The SCA at the centre of the ellipsoid is zero, while
the outlet of the catchment is at the edge of the area. At any given point on the surface, the flow is divergent. We derive its parameters as shown in table 1.
5.2. Inverse ellipsoid
The inverse ellipsoid (elliptic bowl, figure 4(b)) simulates part of a valley or concave slope that represents water drainage or a catchment. Its long axis represents the water drainage line, along which convergent flow is found. The outlet of the catchment is at the centre. At any given point on the surface, the flow is convergent, and its parameters can be derived as shown in table 2.
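The resampling of a continuous surface into a grid DEM, as described in section 2, can be sketched for the ellipsoid as follows. The function names and the semi-axes used in the usage note (a = 400, b = 300, c = 200 metres) are illustrative assumptions and are not the values used in the study; elevations are assumed to be sampled at cell centres.

```python
import math


def ellipsoid_z(x, y, a, b, c):
    """Upper half (z > 0) of x^2/a^2 + y^2/b^2 + z^2/c^2 = 1.

    Returns None outside the footprint of the ellipsoid.
    """
    t = 1.0 - x * x / (a * a) - y * y / (b * b)
    return c * math.sqrt(t) if t > 0 else None


def sample_dem(surface, half_width, cell_size):
    """Sample surface(x, y) at the cell centres of a square grid centred on the origin.

    Row 0 is the northernmost row (largest y); column 0 is the westernmost.
    """
    n = int(round(2 * half_width / cell_size))
    dem = []
    for i in range(n):
        y = half_width - (i + 0.5) * cell_size
        row = [surface(-half_width + (j + 0.5) * cell_size, y) for j in range(n)]
        dem.append(row)
    return dem
```

For example, `sample_dem(lambda x, y: ellipsoid_z(x, y, 400, 300, 200), 100, 10)` yields a 20×20 grid whose highest cells are the four surrounding the summit. Because every elevation comes directly from the surface equation, such a DEM carries no sampling or interpolation error.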
Table 1. Definition and hydrological parameters of an ellipsoid.
Definition: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$, $z > 0$
Slope: $\text{slope} = \dfrac{c^2}{z} \sqrt{\dfrac{x^2}{a^4} + \dfrac{y^2}{b^4}}$
Aspect/Flow direction: $\text{aspect} = \arctan\left(\dfrac{b^2 x}{a^2 y}\right)$
Flow path: $y = cont \times x^m$, $m = \dfrac{a^2}{b^2}$, where $cont$ denotes the integral constant.
Specific Catchment Area (SCA): $SCA = \dfrac{\sqrt{a_1^4 y_1^2 + b_1^4 x_1^2}}{a_1^2 + b_1^2}$, where $a_1$, $b_1$ denote the parameters of the ellipsoid at point $(x_1, y_1)$.
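The expressions in table 1 can be evaluated as in the sketch below. The function names are hypothetical. The SCA function takes the semi-axes a and b of the ellipsoid directly, on the assumption that the parameters at the point share the ratio a/b, in which case any common scale factor cancels between the numerator and the denominator.

```python
import math


def ellipsoid_slope(x, y, a, b, c):
    """Slope from table 1; valid inside the footprint, where z > 0."""
    z = c * math.sqrt(1.0 - x * x / a**2 - y * y / b**2)
    return (c * c / z) * math.sqrt(x * x / a**4 + y * y / b**4)


def ellipsoid_aspect(x, y, a, b):
    """Aspect from table 1 in radians; requires y != 0."""
    return math.atan((b * b * x) / (a * a * y))


def ellipsoid_sca(x, y, a, b):
    """Theoretical SCA from table 1 at the point (x, y)."""
    return math.sqrt(a**4 * y * y + b**4 * x * x) / (a * a + b * b)
```

When a = b the surface is radially symmetric and `ellipsoid_sca` reduces to half the distance from the centre; for instance, `ellipsoid_sca(30, 40, 100, 100)` evaluates to 25 for a point 50 m from the centre. At the centre itself the function returns zero, in line with the description in section 5.1.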
Table 2. Definition and hydrological parameters of an inverse ellipsoid.
Slope: $\text{slope} = \dfrac{c^2}{z} \sqrt{\dfrac{x^2}{a^4} + \dfrac{y^2}{b^4}}$
Flow path: $y = cont \times x^m$
Definition: $\dfrac{x^2}{a^2} + \dfrac{y^2}{b^2} + \dfrac{z^2}{c^2} = 1$,
z