Hydrological Sciences-Journal-des Sciences Hydrologiques, 46(5) October 2001
Determination of probability distributions for Strahler stream lengths based on Poisson process and DEM

MING-SANG YANG & KWAN TUN LEE
Department of River and Harbor Engineering, National Taiwan Ocean University, Keelung 202, Taiwan, ROC
e-mail: ktlee@mail.ntou.edu.tw

Abstract One of the basic tasks in geomorphologic analysis is to determine the probability distributions of the stream lengths of different orders. In practical applications, this information is useful for basin rainfall-runoff modelling. The objective of this study is to determine the length distributions of the Strahler streams. A Poisson process was used to derive the theoretical distributions. The results show that first-order stream lengths follow an exponential distribution and that second- and higher-order stream lengths follow a gamma distribution. In order to verify the theoretical distributions, a digital elevation model (DEM) was adopted to calculate the stream lengths of four basins in Taiwan. Kolmogorov-Smirnov and chi-square tests were used to test the goodness-of-fit of the data. The length distributions of the first- and second-order streams extracted from the DEM agree with the derived distributions.

Key words geomorphologic analysis; Strahler stream ordering; stream length distribution; Poisson process; digital elevation model
Determination of probability distributions of Strahler stream segment lengths based on a Poisson process and a DEM

Résumé One of the fundamental tasks of geomorphologic analysis is to identify the probability distributions of the lengths of stream segments of different orders. In practice, this information is useful in rainfall-runoff modelling. The objective of this study is to determine the length distributions of stream segments under Strahler ordering, on the basis of a Poisson process. The result shows that the length of a first-order segment follows an exponential distribution, and that of a segment of order 2 or higher follows a gamma distribution. To verify the theoretical distributions, a digital elevation model (DEM) is used to compute the segment lengths of four drainage basins in Taiwan. Kolmogorov-Smirnov and chi-square tests are used to assess the goodness of fit. The results show that the empirical length distributions of segments of orders 1 and 2, obtained by DEM analysis, agree with the theoretical propositions.

Mots clefs geomorphologic analysis; Strahler ordering; distribution of stream segment lengths; Poisson process; digital elevation model
Open for discussion until 1 April 2002

INTRODUCTION
The quantitative analysis of stream networks has been of interest to both geomorphologists and hydrologists since the mid-twentieth century. Drainage network analysis has been used not only to identify the structural characteristics of networks, but also as a basis for demonstrating the effects of geomorphic characteristics on hydrological systems. The quantitative study of stream networks began with Horton's (1945) method of classifying streams by order. Strahler (1952) later proposed a modification of Horton's ordering scheme. Strahler's method is now generally preferred because of its simplicity and its avoidance of subjective decisions.

Early work on the lengths of exterior links showed right-skewed distributions that were approximately log-normal (Schumm, 1956; Maxwell, 1960). Smart (1968) reported that the lengths of interior links can be derived by a random-walk simulation (Leopold & Langbein, 1962) and that they fitted an exponential distribution well in two networks in the United States. Later work showed that interior link lengths in 10 networks from various areas were not exponentially distributed, but could be fitted by gamma distributions (Smart, 1969). The most comprehensive study of link length was done by Krumbein & Shreve (1970), who reported that the interior link length could be fairly well approximated by a gamma density with a shape parameter of 2 for Eastern Kentucky basins. Recently, Dodds & Rothman (2001) reported that link length distributions can be approximated by exponential distributions, and that the length distributions of the Strahler streams can be obtained by convolving the link length distributions. Although many investigations have addressed the link length distribution, reports on the Strahler stream length distribution, which is useful for geomorphic runoff modelling, are limited (e.g. Rodriguez-Iturbe & Valdes, 1979; Gupta et al., 1980; Gupta & Mesa, 1988; Lee & Yen, 1997).

Since stream delineation in the field is time-consuming, most network analyses have been based on data derived from topographic maps, which can be of variable accuracy and at best represent an "average" stream network (Knighton, 1998). In this paper, it is assumed that the streams shown as blue lines on topographic maps are portrayed consistently. Even so, measuring stream lengths on maps is laborious.
Recently, digital elevation models (DEMs) have been frequently used to represent landscape topography (see for example, Band, 1986; Jenson & Domingue, 1988; Garbrecht & Martz, 1993; Eash, 1994; Johnson & Miller, 1997; Lee, 1998). Wang & Yin (1998) demonstrated that the accuracy of watershed geomorphic factors extracted from DEMs is comparable to that obtained by manual methods, while the processing time is much shorter. The objective of this paper is to derive a theoretical Strahler stream length distribution based on a Poisson process. To verify the result of the Poisson process assumption, geomorphologic data from four basins in Taiwan were collected. In this study, the stream network, stream order and stream length were extracted from a DEM. Statistical analysis was performed to evaluate the agreement between the theoretical assumption and the watershed geomorphologic data.
DEFINITIONS OF STRAHLER STREAM LENGTHS
If the streams are idealized as single lines containing no lakes, no islands and no junctions of more than two streams at the same point, the resulting diagram is known in the geomorphic literature as a trivalent planted tree, or dendritic stream network. Figure 1 shows a schematic stream network of a basin. The terminology used in this study follows Shreve (1966). The sources are the points farthest upstream in a stream network, and the outlet is the point farthest downstream. The point at which two streams join is called a junction. An exterior link is a segment of stream network between a source and the first junction downstream. An interior link is a segment of stream network between two successive junctions or between the outlet and the first junction upstream.

Fig. 1 Stream network with (a) link length; (b) Strahler stream length. (Labels: junction, exterior link, interior link, 1st-order stream, 2nd-order stream, 3rd-order stream.)

For a dendritic network, the Strahler ordering procedure involves the following rules: (1) streams that originate at a source are defined to be the first-order streams; (2) when two streams of order ω join, a stream of order ω + 1 is created; (3) when two streams of different orders join, the stream segment immediately downstream has the higher of the orders of the two combining streams; and (4) the order of a drainage basin is that of its highest-order stream. Following this ordering scheme, Fig. 1 illustrates the stream segment lengths (hereafter the Strahler stream lengths) of the different-order streams in a third-order basin. It is evident that the first-order Strahler stream length is identical to the exterior link length, and that the higher-order Strahler stream lengths consist of sequential interior link lengths.

STUDY AREAS
Taiwan (22°N-25°30'N, 120°E-122°E) is located between Japan and the Philippines in the Western Pacific and has a total area of 36 000 km². The island is long and narrow with the Central Mountain Range in the middle. The major rock formations of Taiwan form long narrow belts roughly parallel to the long axis of the island. There are 29 main rivers in Taiwan. All of them are short and steep, with small drainage basins and rapid flows. The annual rainfall in Taiwan reaches 2500 mm, which is 2.5 times the world's average. Most of the mountain regions in Taiwan are composed of sedimentary and metamorphic rocks, which are fragile and highly weathered. Accordingly, severe erosion occurs due to intensive rainfall and rapid flows.
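The ordering rules (1)-(4) and the definition of Strahler stream length given in the definitions section can be illustrated on a small link-based network. The following is a minimal sketch, not the raster algorithm used in this study: the `Link` class, the function name `strahler_streams` and the link lengths are hypothetical, and the network is assumed to be strictly trivalent (every link has either no upstream tributaries or exactly two).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Link:
    """One link of a dendritic network (hypothetical helper type)."""
    length: float
    left: Optional["Link"] = None   # upstream tributaries; both None
    right: Optional["Link"] = None  # for an exterior link


def strahler_streams(outlet_link: Link) -> Tuple[int, List[Tuple[int, float]]]:
    """Return (basin order, [(stream order, Strahler stream length), ...])."""
    streams: List[Tuple[int, float]] = []

    def visit(link: Link) -> Tuple[int, float]:
        # Returns the order of this link and the length accumulated so far
        # by the Strahler stream that this link belongs to.
        if link.left is None and link.right is None:
            return 1, link.length                      # rule (1)
        if link.left is None or link.right is None:
            raise ValueError("a junction must join exactly two streams")
        o1, len1 = visit(link.left)
        o2, len2 = visit(link.right)
        if o1 == o2:                                   # rule (2)
            streams.append((o1, len1))                 # both streams end here
            streams.append((o2, len2))
            return o1 + 1, link.length
        if o1 > o2:                                    # rule (3)
            streams.append((o2, len2))                 # lower-order stream ends
            return o1, len1 + link.length              # higher-order continues
        streams.append((o1, len1))
        return o2, len2 + link.length

    order, length = visit(outlet_link)
    streams.append((order, length))                    # rule (4): basin order
    return order, streams


if __name__ == "__main__":
    # Hypothetical third-order network with arbitrary link lengths.
    a, b, c, d, g = Link(1.0), Link(1.5), Link(0.5), Link(2.0), Link(0.75)
    e = Link(1.0, a, b)    # two first-order streams join
    f = Link(1.0, c, d)
    h = Link(1.25, e, g)   # second-order stream continues past a tributary
    k = Link(3.0, h, f)    # two second-order streams join
    print(strahler_streams(k))
```

In this sketch each first-order stream is a single exterior link, while the second-order stream formed by links `e` and `h` is the sum of two sequential interior links, which matches the relationship between link lengths and Strahler stream lengths stated in the definitions.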
Fig. 2 Location map of study basins in Taiwan.
Over 29 basins were initially identified as study areas. However, only a limited number of stream segments can be obtained for analysis from small basins. Basins larger than 600 km² usually include channel regulation works in their downstream reaches and were therefore considered unrepresentative of natural stream characteristics. Therefore, only one sixth-order basin and three fifth-order basins were chosen for analysis. As shown in Fig. 2, the size of the study basins ranges from 254 to 476 km². Since the sample size of higher-order streams was limited in each basin, only the first- and second-order stream lengths were adopted for statistical analysis.
METHODS
Deriving theoretical stream length distribution by Poisson process
In order to obtain an analytical distribution for Strahler stream length, it was assumed that the lengths of Strahler streams in a given network are independent random variables drawn from a common population (after Smart, 1968), and that all topologically distinct networks with a given number of sources are equally likely (after Shreve, 1967). Under these assumptions, the occurrence of junctions along a stream can be represented by a Poisson process. In this study, n independent points were considered, starting at any source, over a discrete stream length l of a dendritic network. At each point on this length of stream, a junction may either occur or not occur. Let the probability of a junction occurring at the ith point on the length of stream be p for i = 0, 1, 2, ..., n. Since the Poisson process has been assumed to apply, i.e. the occurrence of an event at any point on the length of stream is independent of the history of any prior occurrence or non-occurrence, the probability of j occurrences (junctions) in n independent points is given by:

$$f_J(j; n, p) = \binom{n}{j} p^{j} (1 - p)^{n-j} \tag{1}$$

where j = 0, 1, 2, ..., n. Equation (1) is a binomial distribution. If the discrete length of stream is allowed to become a continuous length of stream (i.e. n → ∞), the probability p of a junction occurring in an element of length Δl (= l/n) gets smaller as the number of points n increases. In this situation, np remains constant (np = λl), and the random process is a Poisson process. Hence equation (1) can be rewritten as (Larsen & Marx, 1986):

$$f_J(j; \lambda l) = \lim_{n \to \infty} \binom{n}{j} p^{j} (1 - p)^{n-j} = \frac{e^{-\lambda l} (\lambda l)^{j}}{j!} \tag{2}$$

where l > 0; λ > 0; j = 0, 1, 2, ..., and

$$p = \lambda \, \Delta l \quad (\Delta l \to 0) \tag{3}$$

Equation (2) is a Poisson distribution, whose mean and variance are both equal to λl. The focus of this study is the probability distribution of the stream length l over which x junctions occur. It can be expressed after Larsen & Marx (1986) as:

$$f_L(l; x, \lambda) = \frac{\lambda^{x} l^{x-1} e^{-\lambda l}}{(x - 1)!} \tag{4}$$

Equation (4) is an Erlang distribution, which is a special case of the gamma density, and x (= 1, 2, ..., ∞) is usually called the shape parameter. The shape parameter (i.e. the number of junctions) obtained from the statistical analysis of streams in a natural network is usually not an integer. In order to include non-integer shape parameters, the gamma function Γ(x) is used to replace the (x − 1)! term. Equation (4) then becomes a gamma distribution, which can be expressed as:

$$f_L(l; x, \lambda) = \frac{\lambda^{x} l^{x-1} e^{-\lambda l}}{\Gamma(x)} \tag{5}$$

The distribution shown in equation (5) has a mean x/λ and variance x/λ². For the first-order stream, only one junction occurs in length l. Substituting x = 1 into equation (5), one obtains a negative exponential distribution that can be expressed as:

$$f_L(l; \lambda) = \lambda e^{-\lambda l} \tag{6}$$

The distribution shown in equation (6) has a mean 1/λ and variance 1/λ². The exponential distribution and the Erlang distribution are special cases of the gamma distribution, and these densities are intimately connected with the Poisson process.

For the second-order streams, the Poisson process starts at the upstream end of the second-order stream. Since a second-order stream may consist of sequential interior links, the number of junctions in a second-order stream is at least one. Therefore, for the case of the shape parameter x > 1, the probability distribution of the second-order stream length can be expressed as a gamma distribution as shown in equation (5). Similarly, the stream length distributions for orders higher than two can all be considered as gamma distributions.
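The link between the junction process and the length densities can be checked numerically. The sketch below is illustrative only and is not part of the original analysis: it walks down a stream in small elements of length Δl, places a junction in each element with probability λΔl, and records the length at which the xth junction occurs. The junction rate `lam`, the element length `dl` and the function names are assumptions chosen for the example. The sample mean and variance should approach x/λ and x/λ², the moments of the exponential (x = 1) and gamma densities derived above.

```python
import random
import statistics


def length_to_xth_junction(lam, x, dl, rng):
    """Walk downstream in elements of length dl until x junctions occur.

    Each element holds a junction with probability p = lam * dl,
    independently of all previous elements.
    """
    p = lam * dl
    length = 0.0
    junctions = 0
    while junctions < x:
        length += dl
        if rng.random() < p:
            junctions += 1
    return length


def simulate_lengths(lam, x, dl=0.02, n_streams=10000, seed=1):
    """Return the sample mean and variance of the simulated stream lengths."""
    rng = random.Random(seed)
    lengths = [length_to_xth_junction(lam, x, dl, rng)
               for _ in range(n_streams)]
    return statistics.mean(lengths), statistics.pvariance(lengths)


if __name__ == "__main__":
    lam = 2.0  # hypothetical junction rate per unit length
    for x in (1, 2):
        mean, var = simulate_lengths(lam, x)
        print(f"x={x}: mean={mean:.3f} (theory {x / lam:.3f}), "
              f"variance={var:.3f} (theory {x / lam ** 2:.3f})")
```

Because the walk is discrete, the simulated variance is slightly smaller than x/λ² (by a factor of 1 − p); the gap closes as `dl` is reduced, which mirrors the limit Δl → 0 taken in the derivation.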
Extracting stream length by DEM
In general, the extraction of a stream network using a DEM is performed on a spatial grid system in which potential paths of flow are determined according to the elevations in the eight cells that are spatially adjacent in the grid. A series of subroutines was designed for depression removal, flow direction assignment, flow accumulation determination, and stream network delineation. The detailed procedures can be found in O'Callaghan & Mark (1984) and Jenson & Domingue (1988) and are therefore not repeated here. The Center for Space and Remote Sensing Research at the National Central University, Taiwan, provided 40-m resolution DEM data (raster data structure) for Taiwan. A series of FORTRAN programs called WAGIS (Watershed Geomorphic Information System) was developed. The algorithms were based on Jenson & Domingue (1988) and use a "threshold area" concept to extract the stream network from the DEM data. Stream networks are delineated by estimating a threshold of flow accumulation, which is essentially the minimum drainage area that can support a stream element. All cells having a flow accumulation equal to or greater than the threshold area value are then assumed to contain a stream element. However, as described by Tarboton et al. (1991) and Montgomery & Foufoula-Georgiou (1993), inconsistencies are usually found between DEM-extracted stream networks and those on topographic maps. In this study, a threshold area value was used to extract a preliminary stream network from the DEM. The trial threshold area value applied to extract stream networks in Taiwan was around 300 cells for the 40-m resolution DEM data set (about 0.48 km²). The DEM-extracted stream network was then compared with the blue lines on the 1:25 000 topographic maps. Explicit discrepancies between the DEM network and the topographic map network were adjusted by assigning new channel-head points for channel reaches that the DEM failed to identify, and by eliminating undesirable stream elements. An algorithm based on Garbrecht & Martz (1997) was designed to determine the Strahler stream order in the raster data set. The Strahler stream lengths were then calculated by tracing the flow paths of each cell and tallying the total number of the ordered cells.

Statistical analysis of stream length distributions
A chi-square test and a Kolmogorov-Smirnov test were conducted to evaluate the agreement between the derived length distributions and the watershed geomorphic data. Values of the chi-square test statistic χ² were calculated from:

$$\chi^2 = \sum_{k=1}^{\eta} \frac{(O_k - E_k)^2}{E_k} \tag{7}$$

where O_k is the number of observations in the kth class interval; E_k is the number of observations expected in the kth class interval (according to the distribution being tested); and η is the number of class intervals. The chi-square test divides the range of the stream length into several intervals and then compares the number of observations in each class with the number expected from the fitted distribution. The number of classes is determined from (Sturges, 1926):

$$\eta = 1 + 3.3 \log_{10} m \tag{8}$$

where m is the sample size. An alternative to the chi-square goodness-of-fit test is the Kolmogorov-Smirnov test. The Kolmogorov-Smirnov test statistic D is the maximum difference between the sample cumulative distribution function and the theoretical cumulative distribution function, and is defined by:

$$D = \max \left| F_{\varepsilon} - S_{\varepsilon} \right| \tag{9}$$
where F_ε is the claimed theoretical cumulative distribution function under the null hypothesis and S_ε is the sample cumulative distribution function based on ε observations.

RESULTS AND DISCUSSION
As mentioned above, only first- and second-order streams in the study basins were chosen for analysis because the number of higher-order streams is limited. The total sample size is 828 for the first-order streams and 182 for the second-order streams. To examine the length distributions of the first- and second-order streams, the stream length data were plotted on exponential and gamma probability paper (Figs 3 and 4), respectively. The method of maximum likelihood was used to estimate the parameters of the hypothetical distributions. By using equation (8), the stream length data were separated into several classes. The expected and the observed length histograms for different classes are shown in Figs 5 and 6 for the first- and second-order streams, respectively. Detailed statistical results from the chi-square and Kolmogorov-Smirnov tests are given in Tables 1 and 2.

Fig. 3 Exponential probability plots of the first-order stream lengths for the four study basins. (Panels: Sung-Mao, Nan-Kang, Yen-Ping, Cho-Lu; horizontal axis: Length / Maximum length; series: observed, expected.)

Fig. 4 Gamma probability plots of the second-order stream lengths for the four study basins. (Panels: Sung-Mao, Nan-Kang, Yen-Ping, Cho-Lu; horizontal axis: Length / Maximum length; series: observed, expected.)

Fig. 5 The expected exponential distribution and the observed first-order stream lengths for the four study basins. (Panels: Sung-Mao, Nan-Kang, Yen-Ping, Cho-Lu; horizontal axis: Length / Maximum length; series: observed, expected.)

Table 1 shows the test results for whether the first-order stream lengths can be adequately modelled by an exponential distribution. The hypothesis that the first-order stream lengths come from an exponential distribution cannot be rejected at the 95% confidence level. (The shape parameters of the first-order stream lengths in the study basins were found to be close to one, which agrees with the distribution derived from the Poisson process.) Table 2 shows the results of the goodness-of-fit tests for the second-order stream lengths. The second-order stream length can be modelled by a gamma distribution (95% confidence). Moreover, in deriving the length distribution based on the Poisson process, the shape parameter represents the number of junctions. For the first-order stream, only one junction occurs in length l. Consequently, the number of junctions should be greater than or equal to one for the second-order streams. As listed in Table 2, the shape parameters of the second-order stream lengths are between 1.6 and 2.7, which agrees with the distribution derived from the Poisson process.
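The goodness-of-fit procedure for the first-order streams (maximum-likelihood fitting, Sturges classes from equation (8), the chi-square statistic of equation (7) and the Kolmogorov-Smirnov statistic of equation (9)) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names are hypothetical, the classes are taken as equal-width intervals with the last one left open-ended, the number of classes is rounded to the nearest integer, and the lengths in the demonstration are synthetic rather than the basin data. Only the exponential case is shown, since the gamma case needs the incomplete gamma function, which the Python standard library does not provide.

```python
import math
import random


def fit_exponential(lengths):
    """Maximum-likelihood rate of an exponential density: 1 / sample mean."""
    return len(lengths) / sum(lengths)


def exp_cdf(l, lam):
    """Cumulative distribution function of the exponential density."""
    return 1.0 - math.exp(-lam * l)


def sturges_classes(m):
    """Number of class intervals from equation (8), rounded (an assumption)."""
    return max(1, round(1 + 3.3 * math.log10(m)))


def chi_square_statistic(lengths, lam):
    """Chi-square statistic of equation (7) and the number of classes used."""
    m = len(lengths)
    eta = sturges_classes(m)
    width = max(lengths) / eta            # equal-width classes (assumption)
    observed = [0] * eta
    for l in lengths:
        observed[min(int(l / width), eta - 1)] += 1
    chi2 = 0.0
    for k in range(eta):
        lower = exp_cdf(k * width, lam)
        # The last class is open-ended so that the probabilities sum to one.
        upper = 1.0 if k == eta - 1 else exp_cdf((k + 1) * width, lam)
        expected = m * (upper - lower)
        chi2 += (observed[k] - expected) ** 2 / expected
    return chi2, eta


def ks_statistic(lengths, lam):
    """Kolmogorov-Smirnov statistic D of equation (9)."""
    ordered = sorted(lengths)
    m = len(ordered)
    d = 0.0
    for i, l in enumerate(ordered, start=1):
        f = exp_cdf(l, lam)
        # The empirical CDF jumps at each observation: check both sides.
        d = max(d, abs(i / m - f), abs(f - (i - 1) / m))
    return d


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic lengths only; the sample size matches the 828 first-order
    # streams reported above, but the values are not the basin data.
    sample = [rng.expovariate(2.0) for _ in range(828)]
    lam_hat = fit_exponential(sample)
    chi2, eta = chi_square_statistic(sample, lam_hat)
    print(f"lambda={lam_hat:.3f}, classes={eta}, chi2={chi2:.2f}, "
          f"D={ks_statistic(sample, lam_hat):.4f}")
```

To complete a test, the chi-square statistic would be compared with the critical value for η − 2 degrees of freedom (one parameter is estimated from the data), and D with the tabulated Kolmogorov-Smirnov critical value for the chosen significance level.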
CONCLUSIONS
The probability distribution of the Strahler stream length was investigated. If the junctions along the streams of a network occur according to a Poisson process, the probability distribution of the first-order stream length is an exponential distribution. For second-order or higher-order streams, the stream length is gamma-distributed. By using a DEM to obtain geomorphic data in Taiwan, it was shown that the length distributions of the first- and second-order streams in Taiwan could be well fitted by exponential and gamma distributions, respectively. The results of the theoretical derivation can be applied in geomorphological runoff modelling for runoff travel-time estimation, which will be addressed in a subsequent study.

Acknowledgements This study is part of research supported by the National Science Council, Taiwan, ROC, under grants NSC 88-2625-Z-019-002 and NSC 89-2625-Z-019-001. Financial support from the National Science Council is gratefully acknowledged. Thanks are also given to Mr Ying-Min Wu and Ms Yuh-Ju Chyan for the DEM calculation work.

REFERENCES
Band, L. E. (1986) Topographic partition of watershed with digital elevation models. Wat. Resour. Res. 22(1), 15-24.
Dodds, P. S. & Rothman, D. H. (2001) Geometry of river networks II: distributions of component size and number. Phys. Rev. E 63, 1-15.
Eash, D. A. (1994) A geographic information system procedure to quantify drainage-basin characteristic. Wat. Resour. Bull. 30, 1-8.
Garbrecht, J. & Martz, L. W. (1993) Network and subwatershed parameters extracted from digital elevation models: the Bills Creek experience. Wat. Resour. Bull. 29, 909-916.
Garbrecht, J. & Martz, L. W. (1997) Automated channel ordering and node indexing for raster channel networks. Comput. Geosci. 23(9), 961-966.
Gupta, V. K., Waymire, E. & Wang, C. T. (1980) A representation of an instantaneous unit hydrograph from geomorphology. Wat. Resour. Res. 16(5), 855-862.
Gupta, V. K. & Mesa, O. J. (1988) Runoff generation and hydrologic response via channel network geomorphology: recent progress and open problems. J. Hydrol. 102, 3-28.
Horton, R. E. (1945) Erosional development of streams and their drainage basins: hydrophysical approach to quantitative morphology. Bull. Geol. Soc. Am. 56, 275-370.
Jenson, S. K. & Domingue, J. O. (1988) Extracting topographic structure from digital elevation data for geographic information system analysis. Photogramm. Engng Remote Sens. 54(11), 1593-1600.
Johnson, D. L. & Miller, A. C. (1997) A spatially distributed hydrologic model utilizing raster data structures. Comput. Geosci. 23(3), 267-272.
Knighton, D. (1998) Fluvial Forms and Processes: A New Perspective. Arnold, London, UK.
Krumbein, W. C. & Shreve, R. L. (1970) Some statistical properties of dendritic channel networks. Tech. Rep. no. 13, Office of Naval Research Task 389-150, Dept. of Geol. Sci., Northwestern Univ., Evanston, Illinois, USA.
Larsen, R. J. & Marx, M. L. (1986) An Introduction to Mathematical Statistics and Its Applications (second edn). Prentice-Hall, Englewood Cliffs, New Jersey, USA.
Lee, K. T. (1998) Generating design hydrographs by DEM assisted geomorphic runoff simulation: a case study. J. Am. Wat. Resour. Ass. 34(2), 375-384.
Lee, K. T. & Yen, B. C. (1997) Geomorphology and kinematic-wave based hydrograph derivation. J. Hydraul. Engng, ASCE 123(1), 73-80.
Leopold, L. B. & Langbein, W. B. (1962) The concept of entropy in landscape evolution. US Geol. Surv. Prof. Paper no. 500-A, 1-20.
Maxwell, J. C. (1960) Quantitative geomorphology of the San Dimas National Forest. Tech. Report no. 19, Project NR 389-042, Dept. of Geol., Columbia Univ., New York, USA.
Montgomery, D. R. & Foufoula-Georgiou, E. (1993) Channel network source representation using digital elevation models. Wat. Resour. Res. 29(12), 3925-3934.
O'Callaghan, J. & Mark, D. M. (1984) The extraction of drainage networks from digital elevation data. Computer Vision, Graphics and Image Processing 28, 323-344.
Rodriguez-Iturbe, I. & Valdes, J. B. (1979) The geomorphologic structure of hydrologic response. Wat. Resour. Res. 15(6), 1409-1420.
Schumm, S. A. (1956) Evolution of drainage systems and slope in badlands at Perth Amboy, New Jersey. Bull. Geol. Soc. Am. 67, 597-646.
Shreve, R. L. (1966) Statistical law of stream numbers. J. Geol. 74, 17-37.
Shreve, R. L. (1967) Infinite topologically random channel networks. J. Geol. 75, 178-186.
Smart, J. S. (1968) Statistical properties of stream lengths. Wat. Resour. Res. 4, 1001-1014.
Smart, J. S. (1969) Distribution of interior link lengths in natural channel networks. Wat. Resour. Res. 5(6), 1337-1342.
Strahler, A. N. (1952) Hypsometric (area-altitude) analysis of erosional topography. Bull. Geol. Soc. Am. 63, 1117-1142.
Sturges, H. A. (1926) The choice of a class interval. J. Am. Statist. Ass. 21, 65-66.
Tarboton, D. G., Bras, R. L. & Rodriguez-Iturbe, I. (1991) On the extraction of channel networks from digital elevation data. Hydrol. Processes 5(1), 81-100.
Wang, X. & Yin, Z.-Y. (1998) A comparison of drainage networks derived from digital elevation models at two scales. J. Hydrol. 210, 221-241.
Received 15 January 2001; accepted 21 May 2001