
Stoch Environ Res Risk Assess (2013) 27:1039–1054 DOI 10.1007/s00477-012-0641-6

ORIGINAL PAPER

A new bivariate Gamma distribution generated from functional scale parameter with application to drought data

Muhammad Mohsin • Albrecht Gebhardt • Jürgen Pilz • Gunter Spöck

Published online: 5 September 2012 © Springer-Verlag 2012

Abstract Univariate and bivariate Gamma distributions are among the most widely used distributions in hydrological statistical modeling and applications. This article presents the construction of a new bivariate Gamma distribution which is generated from the functional scale parameter. The utilization of the proposed bivariate Gamma distribution for drought modeling is described by deriving the exact distribution of the inter-arrival time and the proportion of drought along with their moments, assuming that both the lengths of drought duration (X) and non-drought duration (Y) follow this bivariate Gamma distribution. The model parameters of this distribution are estimated by the maximum likelihood method and by an objective Bayesian analysis using the Jeffreys prior and the Markov Chain Monte Carlo method. These methods are applied to a real drought dataset from the State of Colorado, USA.

Keywords Gamma distribution • Maximum likelihood estimates • Likelihood ratio test • Bayesian estimates • Jeffreys prior • Drought modeling

M. Mohsin (corresponding author) • A. Gebhardt • J. Pilz • G. Spöck
Department of Statistics, Alpen-Adria University, 9020 Klagenfurt, Austria

1 Introduction

The man-induced changes and the erratic behaviour of the climate are the main threats to the water resource management

in the modern age. Rapid increase in the world population, deforestation, river regularization, soil sealing, low precipitation, rising temperature and the irregular distribution of resources cause many problems, of which water deficit or drought is a major one. Whatever the reasons for a water deficit, drought can have serious health, social, economic and political impacts such as poverty, hunger, thirst, wildfire, disease, migration, land degradation and erosion, and is thus becoming a major challenge for the future. It occurs all around the equator in hot and dry climatic regions and persists for long periods with far-reaching impacts. Mishra and Singh (2010, 2011) express their concerns about the frequently occurring droughts of recent years and their impacts, which arise from the increasing demand for water and the variability in hydro-meteorological variables due to climate change. They also discuss different concepts of drought and the various methodologies used for drought modeling, such as drought forecasting, probability based modeling, spatio-temporal analysis, use of Global Climate Models (GCMs) for drought scenarios, land data assimilation systems for drought modeling and drought planning. Drought modeling has improved significantly over time. In this paper, we attempt to provide a new and flexible model for the inter-arrival time and the proportion of drought periods. Because the scale parameter is specified as a function, the proposed model can be modified according to the given circumstances, which is why we prefer it to other models and which gives it a wide range of application. Many authors use these types of models in different areas of hydrology. Hallack-Alegria and Watkins (2007) conduct a meteorological drought intensity–duration–frequency analysis based on annual and warm season precipitation records to estimate the return period for different years. Porporato et al. (2001) suggest a measure of


vegetation water stress which combines the mean intensity and the duration of the drought period, i.e. the soil water deficit period. Dupuis (2010) develops a new model for dry period inter-arrival times and analyzes different duration characteristics of dry and wet periods based on monthly Palmer drought severity indices. Nadarajah (2008, 2009a) proposes bivariate Pareto and bivariate F-distributions to model drought data by deriving the exact distributions of the inter-arrival time of drought events, the magnitude of drought events and the proportion of drought events, respectively. Kim et al. (2006) propose a nonparametric method to estimate the joint distribution of drought properties. Song and Singh (2010) use a trivariate Plackett copula to derive the joint probability distribution of drought duration, severity and inter-arrival time, where the drought duration and inter-arrival time each follow the Weibull distribution and the drought severity follows the Gamma distribution, based on stream flow data. Hao and Singh (2011) propose a method based on entropy theory to construct a bivariate distribution that is capable of modeling drought duration and severity with different marginal distributions. Different forms of bivariate Gamma distributions are extensively used to model hydrological events such as flood, rain, precipitation and drought. Clarke (1980) applies a bivariate Gamma distribution to an extension of stream flow records that are correlated with longer records of precipitation. Yue (2001) uses a bivariate Gamma distribution in multi flood frequency analysis. Nadarajah (2007) uses a bivariate Gamma model for drought data and derives the distribution of inter-arrival time and proportion. Yue et al. (2001) review various bivariate Gamma distribution models which are constructed from Gamma marginals and applied frequently to multivariate hydrological events. Cheng et al. (2010) demonstrate a frequency factor based approach for stochastic simulation of a bivariate Gamma distribution which is capable of generating random sample pairs describing the marginal densities of the random variables as well as their correlation coefficients. Some other types of univariate and bivariate Gamma distributions are used in the field of hydrology as well, see Husak et al. (2007), Prekopa and Szantai (1978), Nadarajah and Gupta (2006a, b), Nadarajah and Kotz (2006), Loaiciga and Leipnik (2005). The material of this paper is arranged as follows: The construction of the new bivariate Gamma distribution is presented in Sect. 2. Our main interest here is to model the distribution of the inter-arrival time of drought, $X + Y$, and of the proportion of drought duration to the total length. The explicit expressions for the probability density function and the moments of $S = X + Y$ and $W = X/(X + Y)$ are derived in Sects. 3 and 4. An expression for the estimation of the model parameters by maximum likelihood (ML) is provided in Sect. 5. The estimation of the model parameters by Bayesian and ML methods using drought data for the Colorado state along


with their goodness of fit is presented in Sect. 6 and some concluding remarks are stated in Sect. 7. In this paper, analytical expressions involve some special functions such as the gamma function $\Gamma(\cdot)$, the modified Bessel function of the second kind $K_a(\cdot)$ of order $a$ and the confluent hypergeometric function $U(\cdot)$ with parameters $a$ and $b$, which are defined as follows:

$$\Gamma(a) = \int_0^\infty w^{a-1} \exp(-w)\, dw.$$

$$K_a(x) = \frac{\sqrt{\pi}\, x^a}{2^a\, \Gamma\left(a + \tfrac{1}{2}\right)} \int_1^\infty (w^2 - 1)^{a - \frac{1}{2}} \exp(-xw)\, dw.$$

$$U(z; a, b) = \frac{1}{\Gamma(a)} \int_0^\infty w^{a-1} (1 + w)^{b-a-1} \exp(-zw)\, dw.$$

The details and properties of these special functions can be studied in Prudnikov et al. (1986).

2 Construction of the new bivariate Gamma distribution

In this section we derive the bivariate Gamma distribution as a compound distribution of two Gamma variates. The bivariate Gamma distribution is defined below: Let the random variable X have a Gamma distribution with shape parameter $\alpha$ and scale parameter $\beta$ (which enters the density below as a rate). The probability density function then is:

$$f(x; \alpha, \beta) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} \exp(-\beta x); \quad \alpha, \beta > 0;\; x > 0. \tag{1}$$

Suppose another random variable Y has a Gamma distribution with shape parameter $\gamma$ and scale parameter $\phi(x)$, where $\phi(x)$ is some function of X. The probability density function of Y then is:

$$f(y; \gamma, \phi(x)) = \frac{(\phi(x))^{\gamma}}{\Gamma(\gamma)}\, y^{\gamma-1} \exp(-\phi(x)\, y); \quad \gamma, \phi(x) > 0;\; y > 0. \tag{2}$$

The bivariate Gamma distribution is defined as the compound distribution of (1) and (2). The probability density function of the new bivariate distribution is thus given as:

$$f(x, y) = \frac{\beta^{\alpha} (\phi(x))^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, x^{\alpha-1} y^{\gamma-1} \exp(-(\beta x + \phi(x)\, y)); \quad \alpha, \beta, \gamma, \phi(x) > 0;\; x, y > 0. \tag{3}$$
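The compound construction in (1)–(3) also suggests a direct way to simulate from the model: draw X from (1), then draw Y from (2) with the scale function evaluated at the sampled X. The Python sketch below illustrates this under stated assumptions. The function name `sample_compound_gamma`, the parameter values and the seed are hypothetical, $\beta$ and $\phi(x)$ are treated as rates (as they appear in the densities), and the choice $\phi(x) = \delta/x$ is the one the paper adopts for (4).

```python
import numpy as np

def sample_compound_gamma(n, alpha, beta, gamma_, phi, rng):
    """Draw n pairs: X ~ Gamma(alpha, rate beta), Y | X = x ~ Gamma(gamma_, rate phi(x))."""
    x = rng.gamma(shape=alpha, scale=1.0 / beta, size=n)
    # NumPy parameterises by scale, so the rate phi(x) is inverted elementwise.
    y = rng.gamma(shape=gamma_, scale=1.0 / phi(x))
    return x, y

rng = np.random.default_rng(0)
alpha, beta, gamma_, delta = 2.0, 0.5, 1.5, 0.8   # hypothetical values, not estimates
x, y = sample_compound_gamma(200_000, alpha, beta, gamma_, lambda t: delta / t, rng)

# Sample mean of X against alpha / beta.
print(x.mean(), alpha / beta)
# With phi(x) = delta / x, the ratio Y / X is Gamma(gamma_, rate delta) for every x.
print((y / x).mean(), gamma_ / delta)
```

Any other positive function can be passed as `phi` to explore the other members of the family defined by (3).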

The bivariate probability distribution (3) can be used to produce several bivariate probability distributions depending on the choice of $\phi(x)$. Since there is an insignificant correlation between the drought duration and non-drought duration, in this paper we use $\phi(x) = \delta/x$ in (3) to obtain the following new bivariate Gamma distribution:

$$f(x, y) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, x^{\alpha-\gamma-1} y^{\gamma-1} \exp\left(-\left(\beta x + \delta\, \frac{y}{x}\right)\right); \quad \alpha, \beta, \gamma, \delta > 0;\; x, y > 0. \tag{4}$$

The marginal density of Y can be obtained from (4) by integrating it over x and is given as:

$$g(y) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, y^{\gamma-1} \int_0^\infty x^{\alpha-\gamma-1} \exp\left(-\left(\beta x + \delta\, \frac{y}{x}\right)\right) dx. \tag{5}$$

The integral in (5) can be calculated by using equation (2.3.16.1) in Prudnikov et al. (1986), yielding $g(y)$ as

$$g(y) = \frac{2\, \beta^{\frac{\alpha+\gamma}{2}}\, \delta^{\frac{\alpha+\gamma}{2}}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, y^{\frac{\alpha+\gamma}{2} - 1}\, K_{\alpha-\gamma}\left(2 \sqrt{\beta \delta y}\right); \quad \alpha, \beta, \gamma, \delta, y > 0, \tag{6}$$

where $K_{\nu}(\cdot)$ is the modified Bessel function of the second kind. The s-th moment of (6) is given as

$$E(Y^s) = \frac{\Gamma(\alpha + s)\, \Gamma(\gamma + s)}{(\beta \delta)^s\, \Gamma(\alpha)\, \Gamma(\gamma)}. \tag{7}$$

From this, the mean and variance of Y can easily be calculated. The p-th and q-th product moment of (4) is

$$E(X^p Y^q) = \frac{\Gamma(\alpha + p)\, \Gamma(\alpha + q)\, \Gamma(\gamma + q)}{\beta^{p+q}\, \delta^{q}\, (\Gamma(\alpha))^2\, \Gamma(\gamma)}. \tag{8}$$

It is observed that if $R = Y/X$, then X and R are independent. The probability density function of R is given as

$$f(r) = \frac{\delta^{\gamma}}{\Gamma(\gamma)}\, r^{\gamma-1} \exp(-\delta r); \quad \gamma, \delta > 0;\; r > 0.$$

For simplicity, the derived variables S and W can be written as $S = X(1 + R)$ and $W = 1/(1 + R)$. In addition to the proportion $R = Y/X$, the hydrologists are interested in the joint distribution of $(X, X/(X + Y))$, which shows the dependency between the drought duration and the proportion of the drought events. Nadarajah (2009b) and Porporato et al. (2001) use it quite effectively, which adds more value to the well known result that if X is Gamma then the second component of the joint distribution of $(X, X/(X + Y))$ follows the beta distribution. This not only highlights the dependency between the Gamma and beta distributions but also proves to be useful to study the duration and the proportion of drought. In this paper we apply the continuous bivariate Gamma distribution successfully to model the drought data where X (drought duration) and Y (non-drought duration) have positive discrete values. Applying the transformation $T = X$ and $W = X/(X + Y)$ in the proposed bivariate distribution (4) we get the joint pdf of T and W as

$$f(t, w) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, \frac{(1 - w)^{\gamma-1}\, t^{\alpha-1}}{w^{\gamma+1}} \exp\left(-\left(\beta t + \frac{\delta (1 - w)}{w}\right)\right); \quad t > 0;\; 0 < w < 1. \tag{9}$$

(9) reflects that T and W are independent, where T follows the Gamma distribution and W follows some beta type distribution, see Theorem 2 below.

3 Probability density function of S = X + Y and W = X/(X + Y)

In this section we derive the probability density function of the sum and the ratio when X and Y are distributed according to (4).

Theorem 1 If X and Y are jointly distributed according to (4), then the pdf of S is given as

$$f(s) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)} \sum_{i=0}^{\infty} \frac{(-\beta)^i}{i!}\, s^{\alpha+i-1}\, U(\delta; \gamma, \gamma - \alpha - i + 1); \quad 0 < s < \infty, \tag{10}$$

where $U(\delta; \gamma, \gamma - \alpha - i + 1)$ is the confluent hypergeometric function.

Proof Using the transformation $S = X + Y$ and $W = X/(X + Y)$ in (4), the joint pdf of S and W is given as

$$f(s, w) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, s^{\alpha-1} w^{\alpha-\gamma-1} (1 - w)^{\gamma-1} \exp\left(-\left(\beta s w + \frac{\delta (1 - w)}{w}\right)\right). \tag{11}$$

The pdf of S can be obtained as

$$\begin{aligned}
f(s) &= \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, s^{\alpha-1} \int_0^1 w^{\alpha-\gamma-1} (1 - w)^{\gamma-1} \exp\left(-\left(\beta s w + \frac{\delta (1 - w)}{w}\right)\right) dw \\
&= \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, s^{\alpha-1} \int_0^1 w^{\alpha-\gamma-1} (1 - w)^{\gamma-1} \left\{ \sum_{i=0}^{\infty} \frac{(-\beta s w)^i}{i!} \right\} \exp\left(-\frac{\delta (1 - w)}{w}\right) dw \\
&= \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)} \sum_{i=0}^{\infty} \frac{(-\beta)^i}{i!}\, s^{\alpha+i-1} \int_0^1 w^{\alpha+i-\gamma-1} (1 - w)^{\gamma-1} \exp\left(-\frac{\delta (1 - w)}{w}\right) dw.
\end{aligned} \tag{12}$$


Substituting $t = \frac{1}{w} - 1$ in (12), we get

$$f(s) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)} \sum_{i=0}^{\infty} \frac{(-\beta)^i}{i!}\, s^{\alpha+i-1} \int_0^\infty \frac{t^{\gamma-1}}{(t + 1)^{\alpha+i}} \exp(-\delta t)\, dt. \tag{13}$$

Using the definition of the confluent hypergeometric function in (13), we obtain

$$f(s) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)} \sum_{i=0}^{\infty} \frac{(-\beta)^i}{i!}\, s^{\alpha+i-1}\, U(\delta; \gamma, \gamma - \alpha - i + 1); \quad 0 < s < \infty.$$

Theorem 2 If X and Y are jointly distributed according to (4), then the pdf of W is given as

$$g(w) = \frac{\delta^{\gamma}}{\Gamma(\gamma)}\, \frac{(1 - w)^{\gamma-1}}{w^{\gamma+1}} \exp\left(-\frac{\delta (1 - w)}{w}\right); \quad 0 < w < 1. \tag{14}$$

Proof Using (11), the marginal density of W can be obtained as

$$g(w) = \frac{\beta^{\alpha} \delta^{\gamma}}{\Gamma(\alpha)\, \Gamma(\gamma)}\, w^{\alpha-\gamma-1} (1 - w)^{\gamma-1} \exp\left(-\frac{\delta (1 - w)}{w}\right) \int_0^\infty s^{\alpha-1} \exp(-\beta s w)\, ds. \tag{15}$$

Solving the integral in (15), we get

$$g(w) = \frac{\delta^{\gamma}}{\Gamma(\gamma)}\, \frac{(1 - w)^{\gamma-1}}{w^{\gamma+1}} \exp\left(-\frac{\delta (1 - w)}{w}\right); \quad 0 < w < 1.$$

4 Moments of S = X + Y and W = X/(X + Y)

In this section we derive the moments of $S = X + Y$ and $W = X/(X + Y)$ when X and Y are distributed according to (4).

Theorem 3 If X and Y are jointly distributed according to (4), then the q-th moment of S is given as

$$E(S^q) = \frac{1}{\beta^{q}\, (\Gamma(\alpha))^2\, \Gamma(\gamma)} \sum_{j=0}^{q} \binom{q}{j} \frac{\Gamma(\alpha + j)\, \Gamma(\alpha + q - j)\, \Gamma(\gamma + q - j)}{\delta^{q-j}}. \tag{16}$$

Proof Since X and R are independent,

$$E(X^p Y^q) = E(X^{p+q} R^q) = E(X^{p+q})\, E(R^q).$$

We know from (8) that the p-th and q-th product moment of (4) is

$$E(X^p Y^q) = \frac{\Gamma(\alpha + p)\, \Gamma(\alpha + q)\, \Gamma(\gamma + q)}{\beta^{p+q}\, \delta^{q}\, (\Gamma(\alpha))^2\, \Gamma(\gamma)}.$$

The proof of (16) can be completed by using (8) in the following expression

$$E(S^q) = E(X + Y)^q = \sum_{j=0}^{q} \binom{q}{j} E(X^j)\, E(Y^{q-j}).$$

Theorem 4 If X and Y are jointly distributed according to (4), then the p-th order moments of W are given as

$$E(W^p) = \delta^{\gamma}\, U(\delta; \gamma, \gamma - p + 1). \tag{17}$$

Proof From (14), we have

$$E(W^p) = \frac{\delta^{\gamma}}{\Gamma(\gamma)} \int_0^1 w^{p-\gamma-1} (1 - w)^{\gamma-1} \exp\left(-\frac{\delta (1 - w)}{w}\right) dw. \tag{18}$$

Setting $t = \frac{1}{w} - 1$ in (18), we get

$$E(W^p) = \frac{\delta^{\gamma}}{\Gamma(\gamma)} \int_0^\infty \frac{t^{\gamma-1}}{(t + 1)^{p}} \exp(-\delta t)\, dt. \tag{19}$$

Using the definition of the confluent hypergeometric function gives the proof of (17).

5 Estimation of the parameters by maximum likelihood method

In this section we estimate the model parameters of the new bivariate Gamma distribution by using the maximum likelihood method. The log likelihood function of (4) is given as

$$L(\eta) = n \alpha \ln \beta + n \gamma \ln \delta - n \ln \Gamma(\alpha) - n \ln \Gamma(\gamma) + (\alpha - \gamma - 1) \sum_{i=1}^{n} \ln(x_i) + (\gamma - 1) \sum_{i=1}^{n} \ln(y_i) - \beta \sum_{i=1}^{n} x_i - \delta \sum_{i=1}^{n} \frac{y_i}{x_i}. \tag{20}$$

The partial derivatives of (20) with respect to $\alpha, \beta, \gamma, \delta$, equated to zero, are given as

$$\frac{\partial L(\eta)}{\partial \alpha} = n \ln \beta - n\, \psi(\alpha) + \sum_{i=1}^{n} \ln(x_i) = 0, \tag{21}$$

where $\psi(\cdot)$ is the first derivative of the log gamma function (also called Psi or digamma function),

$$\frac{\partial L(\eta)}{\partial \beta} = \frac{n \alpha}{\beta} - \sum_{i=1}^{n} x_i = 0, \tag{22}$$

$$\frac{\partial L(\eta)}{\partial \gamma} = n \ln \delta - n\, \psi(\gamma) - \sum_{i=1}^{n} \ln(x_i) + \sum_{i=1}^{n} \ln(y_i) = 0, \tag{23}$$

$$\frac{\partial L(\eta)}{\partial \delta} = \frac{n \gamma}{\delta} - \sum_{i=1}^{n} \frac{y_i}{x_i} = 0. \tag{24}$$

Solving the above nonlinear equations (21–24) numerically, we get the estimated values $\hat{\alpha}, \hat{\beta}, \hat{\gamma}$ and $\hat{\delta}$. The first derivatives with respect to $\alpha$ and $\beta$ contain only the values of $x_i$ and depend neither on $y_i$ nor on $\gamma$ and $\delta$. Therefore the ML-estimates $\hat{\alpha}$ and $\hat{\beta}$ are the same as those of the univariate Gamma distribution discussed by Johnson et al. (1994). The Fisher information matrix $Q(\hat{\eta})$ can be obtained by taking second derivatives of the log-likelihood function (20). The matrix $Q(\hat{\eta})$ with corresponding entries is given in the Appendix. Asymptotic confidence intervals can be constructed on the basis of the approximate normal distribution of the ML estimate $\hat{\eta} = (\hat{\alpha}, \hat{\beta}, \hat{\gamma}, \hat{\delta})$ with standard errors

$$\sigma(\hat{\alpha}, \hat{\beta}, \hat{\gamma}, \hat{\delta}) = \sqrt{(i, i)\ \text{entry in}\ [Q(\hat{\eta})]^{-1}}.$$

An approximate $(1 - \nu)100\%$ confidence interval, e.g. for $\alpha$, is constructed as

$$\hat{\alpha} \pm z\, \sigma(\hat{\alpha}),$$

where z is the $(1 - \nu/2)$ quantile of the standard normal distribution.
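Equations (22) and (24) can be solved in closed form once $\alpha$ and $\gamma$ are known ($\beta = \alpha/\bar{x}$ and $\delta = \gamma/\bar{r}$ with $r_i = y_i/x_i$). Substituting these into (21) and (23) leaves two one-dimensional equations of the same type, $\ln a - \psi(a) = \ln \bar{v} - \overline{\ln v}$. The Python sketch below illustrates this reduction on simulated data. It is not the authors' implementation; the helper names `gamma_shape_mle` and `fit_bivariate_gamma`, the root-finding bracket and the parameter values are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.special import digamma

def gamma_shape_mle(v):
    """Solve ln(a) - psi(a) = ln(mean(v)) - mean(ln(v)) for the Gamma shape a."""
    c = np.log(v.mean()) - np.log(v).mean()      # positive by Jensen's inequality
    # ln(a) - psi(a) decreases from +inf to 0, so a wide bracket contains the root.
    return brentq(lambda a: np.log(a) - digamma(a) - c, 1e-6, 1e6)

def fit_bivariate_gamma(x, y):
    """ML estimates (alpha, beta, gamma, delta) from the score equations (21)-(24)."""
    r = y / x
    alpha = gamma_shape_mle(x)       # (21) after substituting (22)
    beta = alpha / x.mean()          # (22)
    gam = gamma_shape_mle(r)         # (23) after substituting (24)
    delta = gam / r.mean()           # (24)
    return alpha, beta, gam, delta

# Simulated check with hypothetical parameter values.
rng = np.random.default_rng(1)
true = (1.1, 0.1, 0.55, 0.18)
x = rng.gamma(true[0], 1.0 / true[1], size=20_000)
y = x * rng.gamma(true[2], 1.0 / true[3], size=20_000)   # Y = X * R with R independent of X
est = fit_bivariate_gamma(x, y)
print(est)
```

Standard errors would still require the information matrix, which this sketch does not compute.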

6 Application: drought data

In this section we use drought data from the state of Colorado, USA, to estimate the model parameters of (4) by means of Bayesian and ML methods. The drought data are available at http://www.ncdc.noaa.gov/oa/climate/onlineprod/drought/xmgr.html and contain the monthly modified Palmer Drought Severity Index (PDSI) from January 1895 to December 2010. The PDSI is based on the idea of a balance between moisture supply and demand. Since the PDSI quantifies long-term drought conditions for a given location and time, it is appropriate for the current discrete dataset. Alley (1984) describes PDSI as a meteorological drought

Fig. 1 The five climate divisions of the state of Colorado



Table 1 Drought duration and non-drought duration data for Colorado climate division 1

Case   Non-drought   Drought       Case   Non-drought   Drought
       duration      duration             duration      duration
       (months)      (months)             (months)      (months)
1      51            3             29     1             13
2      1             2             30     18            29
3      10            7             31     3             5
4      4             14            32     1             2
5      4             1             33     2             14
6      6             9             34     20            22
7      5             5             35     5             5
8      5             7             36     4             4
9      18            3             37     21            1
10     3             4             38     3             6
11     193           4             39     1             1
12     1             6             40     11            1
13     72            76            41     6             7
14     16            9             42     1             14
15     5             3             43     12            7
16     61            10            44     6             5
17     36            26            45     40            9
18     7             59            46     10            1
19     16            5             47     29            4
20     5             3             48     1             6
21     7             6             49     4             3
22     16            33            50     3             32
23     1             2             51     1             1
24     2             2             52     11            5
25     5             18            53     1             8
26     8             6             54     14            11
27     2             6             55     3             10
28     21            7             56     12            5

index used to measure the dryness based on the precipitation and temperature. He finds that PDSI gives the spatial and temporal representation of the historical droughts along with the historical perspective of the current weather conditions. Willeke et al. (1994) regard PDSI as the most useful monitoring and measuring tool for soil moisture conditions and for starting or ending drought contingency plans. Kogan (1995) and Hu and Willson (2000) state that PDSI is widely used to monitor drought in the United States. Palmer arbitrarily selects a classification scale varying from -6.0 to +6.0 depending upon the moisture conditions. Alley (1984) modifies the Palmer scale to run from -4.0 to +4.0, where -0.49 to +0.49 refers to moisture conditions near normal. Yevjevich (1967) and Guerrero-Salazar and Yevjevich (1975) introduce the statistical theory of runs, which proposes that drought takes place when the value of PDSI is less than 0. Therefore we take 0 as the threshold: values below it indicate drought conditions and values above it indicate wet conditions, with relative intensities. Changing the threshold value from 0 to any other number may affect the results. The state of Colorado has five climate divisions which are numbered and named as 1 (Arkansas Drainage), 2 (Colorado Drainage), 3 (Kansas Drainage), 4 (Platte Drainage) and 5 (Rio Grande Drainage). These five climate divisions are marked by the US Bureau of Census and Climate Data Centre and can be seen on the web site http://www.cpc.ncep.noaa.gov/products/analysis_monitoring/regional_monitoring/CLIM_DIVS/. Figure 1 shows the geographical map of these five divisions. For illustrative purposes, the real data on drought duration and non-drought duration for climate division 1 are given in Table 1. The descriptive statistics of these five divisions are exhibited in Table 2. We obtain data on drought duration and non-drought duration for each climate division by using PDSI data and our focus is to

1. estimate the model parameters of (4) by Bayesian and ML methods.
2. determine the distribution of the inter-arrival time of drought (S) = drought duration + non-drought duration.
3. determine the distribution of the proportion of drought (W) = drought duration/(drought duration + non-drought duration).
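The step from a monthly PDSI series to durations, inter-arrival times and proportions can be sketched in a few lines of Python. This is only an illustration: the short `pdsi` series is made up, the function names are hypothetical, and pairing each spell with the spell that immediately follows it is an assumption, since the text does not state how drought and non-drought spells are matched into cases.

```python
from itertools import groupby

def spell_lengths(pdsi, threshold=0.0):
    """Return consecutive runs as (is_drought, length); drought means PDSI < threshold."""
    return [(is_dry, sum(1 for _ in grp))
            for is_dry, grp in groupby(pdsi, key=lambda v: v < threshold)]

def pair_spells(runs):
    """Pair each run with the next one (assumed pairing); a trailing unpaired run is dropped."""
    pairs = []
    for first, second in zip(runs[0::2], runs[1::2]):
        lengths = {first[0]: first[1], second[0]: second[1]}
        pairs.append((lengths[True], lengths[False]))   # (drought X, non-drought Y)
    return pairs

# Made-up monthly PDSI values, for illustration only.
pdsi = [0.5, 1.2, -0.3, -1.0, -2.1, 0.4, 0.1, 0.9, -0.2, 1.5]
pairs = pair_spells(spell_lengths(pdsi))
inter_arrival = [x + y for x, y in pairs]        # S = X + Y
proportion = [x / (x + y) for x, y in pairs]     # W = X / (X + Y)
print(pairs, inter_arrival, proportion)
```

Because consecutive runs always alternate between drought and non-drought, each pair contains exactly one spell of each kind.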

Since model (4) is based on the assumption that X and Y/X are independent therefore drought duration and non-drought

Table 2 Descriptive statistics for Colorado PDSI data

Climate    Number of   Drought frequency   Mean drought   Standard deviation    Mean non-drought   Standard deviation of
division   drought     (number/year)       duration       of drought duration   duration           non-drought duration
                                           (months)       (months)              (months)           (months)
1          56          0.483               10.125         13.555                14.732             28.449
2          55          0.474               12.127         13.916                13.127             24.077
3          60          0.517               9.550          15.201                13.150             23.741
4          45          0.380               11.244         15.014                19.222             38.766
5          42          0.362               16.191         30.118                16.976             27.175


Table 3 Pearson's correlation test for independence for drought duration and non-drought duration/drought duration

Climate division   p value
1                  0.184
2                  0.148
3                  0.161
4                  0.237
5                  0.221

duration are used for the analysis. One physical interpretation of this assumption is that the ratio of non-drought duration to drought duration is independent of the drought duration. The assumption is checked by applying Pearson's correlation test for independence to the observed values of drought duration and non-drought duration/drought duration. Table 3 displays the p values of this test. All p values exceed 0.05, so the test gives no evidence of association between X and R in any of the five climate divisions of Colorado state.
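The independence check can be sketched in Python for climate division 1 using the durations listed in Table 1. The code is an illustration under assumptions: it uses `scipy.stats.pearsonr`, which may differ in detail from the test the authors ran, and no attempt is made here to reproduce the tabulated p value.

```python
import numpy as np
from scipy.stats import pearsonr

# Table 1, cases 1-56: drought duration X and non-drought duration Y (months).
drought = np.array([3, 2, 7, 14, 1, 9, 5, 7, 3, 4, 4, 6, 76, 9, 3, 10, 26, 59, 5, 3,
                    6, 33, 2, 2, 18, 6, 6, 7, 13, 29, 5, 2, 14, 22, 5, 4, 1, 6, 1, 1,
                    7, 14, 7, 5, 9, 1, 4, 6, 3, 32, 1, 5, 8, 11, 10, 5], dtype=float)
non_drought = np.array([51, 1, 10, 4, 4, 6, 5, 5, 18, 3, 193, 1, 72, 16, 5, 61, 36, 7,
                        16, 5, 7, 16, 1, 2, 5, 8, 2, 21, 1, 18, 3, 1, 2, 20, 5, 4, 21,
                        3, 1, 11, 6, 1, 12, 6, 40, 10, 29, 1, 4, 3, 1, 11, 1, 14, 3, 12],
                       dtype=float)

ratio = non_drought / drought                 # R = Y / X
r_stat, p_value = pearsonr(drought, ratio)    # correlation coefficient and two-sided p value
print(f"r = {r_stat:.3f}, p = {p_value:.3f}")
```

A zero Pearson correlation does not by itself imply independence, so this check is best read as a screen for linear association between X and R.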

6.1 Bayesian estimation of model parameters using MCMC Bayesian methodology combines prior information with the sample information to make inference. The Bayesian posterior density is proportional to the prior density times the likelihood. The objective Bayesian approach in this paper uses the so-called non-informative Jeffreys prior. A Markov Chain Monte Carlo (MCMC) simulation with Gaussian proposal distribution is used to estimate the model parameters from the posterior distribution implemented in the R-package MCMCpack proposed by Martin et al. (2011). We use 520000 MCMC iterations with a burn-in of 20000 values and in the remaining chain only every 100th sample is retained in order to get 5000 independent samples. Convergence is monitored by the trace plots for each parameter. The posterior distributions of the parameters are approximately Gaussian. The variance– covariance matrix V for the Gaussian proposal distribution selects the optimum value from the ‘‘tune’’ parameter. The variance–covariance matrix is calculated as:

Fig. 2 Shapes of the posterior distributions of the model parameters α, β, γ and δ for climate division 1 (panels: trace plot and posterior density for each of alpha, beta, gamma and delta)



Table 4 Summary statistics of the posterior distributions for α, β, γ and δ

Parameter   Mean    Median   Standard    25th         75th         95 % credible
                             deviation   percentile   percentile   interval
For climate division 1
α           1.124   1.115    0.189       0.996        1.240        (0.797, 1.510)
β           0.113   0.111    0.024       0.096        0.127        (0.072, 0.162)
γ           0.586   0.580    0.093       0.524        0.645        (0.425, 0.774)
δ           0.189   0.186    0.045       0.159        0.216        (0.113, 0.281)
For climate division 2
α           1.215   1.201    0.207       1.071        1.341        (0.851, 1.652)
β           0.102   0.010    0.021       0.087        0.115        (0.066, 0.147)
γ           0.484   0.479    0.076       0.431        0.531        (0.351, 0.643)
δ           0.155   0.152    0.039       0.128        0.178        (0.091, 0.238)
For climate division 3
α           0.875   0.869    0.139       0.782        0.959        (0.630, 1.156)
β           0.093   0.092    0.020       0.080        0.106        (0.059, 0.135)
γ           0.483   0.480    0.073       0.435        0.528        (0.355, 0.632)
δ           0.126   0.124    0.030       0.105        0.145        (0.075, 0.188)
For climate division 4
α           1.123   1.105    0.211       0.978        1.253        (0.770, 1.558)
β           0.102   0.010    0.024       0.085        0.116        (0.061, 0.153)
γ           0.454   0.452    0.078       0.401        0.503        (0.315, 0.617)
δ           0.100   0.098    0.028       0.081        0.117        (0.054, 0.159)
For climate division 5
α           0.682   0.673    0.127       0.596        0.761        (0.462, 0.950)
β           0.044   0.043    0.011       0.036        0.051        (0.025, 0.067)
γ           0.588   0.581    0.108       0.512        0.656        (0.399, 0.812)
δ           0.184   0.180    0.050       0.150        0.215        (0.101, 0.286)

$$V = T\, (-Q)^{-1}\, T,$$

where T, a diagonal positive definite matrix, is formed from the "tune" parameter and Q is the approximate Hessian of the function. For all estimations we set the tuning value for the variance–covariance matrix V to 1.5. Here, we show only the shape of the posterior distributions of the model parameters along with trace plots for climate division 1. The posterior distributions of the parameters for the other four divisions are also approximately Gaussian, and the chains in their trace plots appear stable and well mixed. The acceptance rates of the MCMC Metropolis–Hastings sampling algorithm for all the five climate divisions are approximately 21 % each. Figure 2 shows the shape of the posterior distributions of the parameters along with trace plots for climate division 1. Table 4 shows the summary statistics of the posterior distributions for the different parameters of the proposed bivariate distribution (4) using the Jeffreys prior.

6.2 ML-estimation of model parameters
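The sampling scheme of Sect. 6.1 (Gaussian random-walk proposals, a burn-in period and thinning) can be illustrated with a small Metropolis sampler in Python. This is a sketch under several assumptions and not the MCMCpack routine used in the paper: the proposal uses fixed independent step sizes instead of the matrix V, the data are simulated, the run is far shorter than the 520000 iterations reported, and the log-prior is a Jeffreys-type form derived here from the factorization of (4) into independent Gamma laws for X and R, not copied from the source.

```python
import numpy as np
from scipy.special import gammaln, polygamma

def log_posterior(theta, x, y):
    """Log-likelihood (20) plus an assumed Jeffreys-type log-prior."""
    a, b, c, d = theta
    if min(theta) <= 0:
        return -np.inf
    n = x.size
    loglik = (n * a * np.log(b) + n * c * np.log(d) - n * gammaln(a) - n * gammaln(c)
              + (a - c - 1) * np.log(x).sum() + (c - 1) * np.log(y).sum()
              - b * x.sum() - d * (y / x).sum())
    # Assumption: prior proportional to sqrt((a*psi'(a) - 1) * (c*psi'(c) - 1)) / (b * d).
    log_prior = (0.5 * np.log(a * polygamma(1, a) - 1)
                 + 0.5 * np.log(c * polygamma(1, c) - 1) - np.log(b) - np.log(d))
    return loglik + log_prior

def metropolis(x, y, start, step, n_iter, burn_in, thin, rng):
    """Random-walk Metropolis; keeps every thin-th draw after burn_in iterations."""
    theta = np.asarray(start, dtype=float)
    lp = log_posterior(theta, x, y)
    kept, accepted = [], 0
    for it in range(n_iter):
        proposal = theta + step * rng.standard_normal(4)
        lp_new = log_posterior(proposal, x, y)
        if np.log(rng.uniform()) < lp_new - lp:
            theta, lp = proposal, lp_new
            accepted += 1
        if it >= burn_in and (it - burn_in) % thin == 0:
            kept.append(theta.copy())
    return np.array(kept), accepted / n_iter

rng = np.random.default_rng(2)
true = np.array([1.1, 0.1, 0.55, 0.18])              # hypothetical values
x = rng.gamma(true[0], 1.0 / true[1], size=56)
y = x * rng.gamma(true[2], 1.0 / true[3], size=56)
step = np.array([0.15, 0.02, 0.08, 0.04])            # hand-picked step sizes
samples, acc_rate = metropolis(x, y, true, step, 12_000, 2_000, 10, rng)
print(samples.shape, acc_rate, samples.mean(axis=0))
```

With the settings reported in the text the same bookkeeping gives (520000 - 20000)/100 = 5000 retained draws; the short run here keeps 1000.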

The maximum likelihood estimates with the standard errors based on the inverse information matrix and the 95 % confidence intervals (based on the normal approximation) of the model parameters, along with the negative logarithm of the maximized likelihood (NL) values for the drought duration (X) and non-drought duration (Y) for the five climate divisions, are exhibited in Table 5. The Newton–Raphson algorithm implemented in the R package maxLik (Henningsen and Toomet 2011) is used to maximize the likelihood. It is observed that the MCMC estimates of the parameters and their standard deviations are quite similar to those obtained by the ML method, which reflects the stability of the estimates of the model parameters. We note that although the standard errors of the ML estimates are fairly small, the estimates themselves appear to differ across the divisions. So it is important to check whether the distribution of drought duration and non-drought duration given in (4) is the same across the five


Table 5 Estimated parameters by ML method

Estimates                 Climate divisions
                          1          2          3          4          5
α̂                         1.071      1.160      0.839      1.063      0.645
β̂                         0.106      0.096      0.088      0.095      0.040
γ̂                         0.563      0.465      0.466      0.434      0.558
δ̂                         0.177      0.143      0.117      0.091      0.168
SE(α̂)                     0.179      0.197      0.133      0.198      0.119
SE(β̂)                     0.022      0.020      0.019      0.022      0.011
SE(γ̂)                     0.089      0.073      0.070      0.074      0.102
SE(δ̂)                     0.042      0.036      0.028      0.026      0.046
Lower 95 % C.I. for α     0.720      0.774      0.578      0.675      0.412
Upper 95 % C.I. for α     1.422      1.546      1.100      1.451      0.878
Lower 95 % C.I. for β     0.063      0.057      0.051      0.052      0.018
Upper 95 % C.I. for β     0.149      0.135      0.125      0.138      0.062
Lower 95 % C.I. for γ     0.389      0.322      0.329      0.289      0.358
Upper 95 % C.I. for γ     0.737      0.608      0.603      0.579      0.758
Lower 95 % C.I. for δ     0.095      0.072      0.062      0.040      0.078
Upper 95 % C.I. for δ     0.259      0.214      0.172      0.142      0.258
NL                        -398.318   -406.924   -414.365   -338.438   -319.366

climate divisions or not. If this assumption is fulfilled, the drought data for all five divisions can be pooled. Since the distribution (4) is based on four parameters we state the null and alternative hypotheses as

$$H_0:\; \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4 = \alpha_5 = \alpha,\quad \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = \beta,\quad \gamma_1 = \gamma_2 = \gamma_3 = \gamma_4 = \gamma_5 = \gamma,\quad \delta_1 = \delta_2 = \delta_3 = \delta_4 = \delta_5 = \delta$$

versus

$$H_1:\; \text{not all } \alpha_i \text{ are equal, not all } \beta_i \text{ are equal, not all } \gamma_i \text{ are equal, not all } \delta_i \text{ are equal.}$$

For this purpose we carry out a likelihood ratio test (LRT). First we combine the drought data of all five divisions. Under the null hypothesis we find the ML estimates $\hat{\alpha} = 0.901$, $\hat{\beta} = 0.078$, $\hat{\gamma} = 0.487$ and $\hat{\delta} = 0.132$ with standard errors $SE(\hat{\alpha}) = 0.069$, $SE(\hat{\beta}) = 0.008$, $SE(\hat{\gamma}) = 0.035$ and $SE(\hat{\delta}) = 0.015$ for the joint distribution of drought duration and non-drought duration. The value of the negative logarithm of the maximized likelihood (NL) for the pooled data of all five divisions is 1887.508. The NL value without the restriction of the null hypothesis is 1877.411 (the sum of the last row of Table 5, in absolute value). Under $H_0$, the LRT statistic, which is twice the difference between these two NL values, has an approximate Chi square distribution with 8 degrees of freedom. Thus,

$$2\,(1887.508 - 1877.411) = 20.194 > \chi^2_{8,\,0.05} = 15.507,$$

and therefore we reject the null hypothesis that the five divisions of Colorado are homogeneous with respect to the joint distribution of drought duration and non-drought duration. Hence, we conclude that there is a significant difference among the five divisions for the drought data, so we cannot pool them. The contour plot of (4) for the drought duration and non-drought duration for climate division 1 using its ML estimates is presented in Fig. 3. Next we check whether (4) provides an adequate fit for the drought data for all the five climate divisions or not.

Fig. 3 Contour plot of the fitted pdf (4) for drought duration and non-drought duration for the climate division 1
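The likelihood ratio computation described above can be reproduced with a few lines of Python. The sketch uses only numbers quoted in the text and in Table 5; the variable names are hypothetical, and the degrees of freedom are kept as an explicit parameter set to the value stated in the text.

```python
from scipy.stats import chi2

# NL per division (absolute values of the last row of Table 5) and for the pooled data.
nl_by_division = [398.318, 406.924, 414.365, 338.438, 319.366]
nl_pooled = 1887.508
df = 8          # as stated in the text; adjust here to explore other choices

nl_unrestricted = sum(nl_by_division)               # 1877.411
statistic = 2.0 * (nl_pooled - nl_unrestricted)     # twice the difference in NL
critical = chi2.ppf(0.95, df)                       # upper 5 % point of chi-square(df)
p_value = chi2.sf(statistic, df)

print(f"LRT = {statistic:.3f}, critical value = {critical:.3f}, p = {p_value:.4f}")
print("reject H0" if statistic > critical else "do not reject H0")
```

Because the conclusion depends on the reference distribution, leaving `df` as a parameter makes it easy to see how sensitive the decision is to that choice.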



Fig. 4 Probability plots of the marginal X and Y density of (4) for the observed values of the drought duration (X) and non-drought duration (Y) dataset for climate a division 1, b division 2, c division 3, d division 4, e division 5

This can be checked by probability plots. In probability plots, the observed probability is plotted against the predicted probability for the fitted model. To check the goodness of fit of (4) for the bivariate distribution of drought duration (X) and non-drought duration (Y) we draw two probability plots for X and Y, respectively. This can be done by plotting $F_X(x_i)$ versus $(i - 0.375)/(n + 0.25)$ as recommended by Blom (1958) and Chambers et al. (1983), where $F_X(\cdot)$ represents the marginal cumulative distribution function (CDF) of X corresponding to (4) and $x_i$ are the sorted values in ascending order of X. Similarly, we check the goodness of fit of (4) for Y by plotting $F_Y(y_i)$ versus $(i - 0.375)/(n + 0.25)$, where $F_Y(\cdot)$ represents the marginal CDF of Y corresponding to (4) and



Fig. 5 Fitted values of the pdf (10) of S = X + Y and (14) of W = X/(X + Y), where X is drought duration and Y is non-drought duration for the climate a division 1, b division 2, c division 3, d division 4, e division 5



Table 6 Model comparison

Divisions   Our proposed model       Nadarajah's (2009b) model
            NL          AIC          NL          AIC
1           -398.318    -788.636     -392.804    -779.608
2           -406.924    -805.848     -388.401    -770.802
3           -414.365    -820.730     -405.240    -804.480
4           -338.438    -668.876     -332.033    -658.066
5           -319.366    -630.732     -314.025    -622.050

$y_i$ are the sorted values in ascending order of Y. However, the CDF of Y is not available in closed form, so we use numerical solvers to obtain the quantiles for the probability plot of Y. Figure 4a–e represent the probability plots for X and Y for the five climate divisions, respectively. The fit appears to be reasonably good for the drought duration X, where X follows the standard Gamma distribution. However, the fit for the non-drought duration Y appears to be questionable and the respective results should be used cautiously. The reason might be that the non-drought duration datasets vary more than the drought duration datasets. Another possible reason is that the corresponding distribution function is relatively complicated. Further, the probability density functions (pdf) of the inter-arrival time S and the proportion of drought W are fitted by using Eqs. (10) and (14) with their respective ML estimates for all the five divisions of Colorado. These fitted distributions are compared to the histograms of inter-arrival time S and proportion of drought W of these five divisions. The fitted pdf for the inter-arrival time S approximately follows the general pattern of the histogram and looks reasonably well fitted. However, the fitted pdf for the proportion W is debatable as it deviates somewhat from the general pattern of the histogram, so the corresponding results should be used cautiously. Figure 5a–e show the histograms along with their respective fitted pdfs for the five divisions. Further, our proposed model can be compared with other existing bivariate models. Of these existing models, Nadarajah (2009b) uses a three parameter bivariate distribution to model the drought duration and non-drought duration. We compare our model with Nadarajah's (2009b) model on the basis of the negative log likelihood value (NL) and Akaike's information criterion (AIC) using the same drought data of the Colorado State. The results of this comparison are given in Table 6.
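The probability-plot construction for the drought duration X described earlier in this section can be sketched in Python. The code uses the drought durations of Table 1 and the division 1 estimates from Table 5; treating the marginal of X as a Gamma law with shape 1.071 and rate 0.106 follows (1), while the variable names and the use of `scipy.stats.gamma` are choices made for illustration only.

```python
import numpy as np
from scipy.stats import gamma

# Drought durations (months) for climate division 1, from Table 1.
drought = np.array([3, 2, 7, 14, 1, 9, 5, 7, 3, 4, 4, 6, 76, 9, 3, 10, 26, 59, 5, 3,
                    6, 33, 2, 2, 18, 6, 6, 7, 13, 29, 5, 2, 14, 22, 5, 4, 1, 6, 1, 1,
                    7, 14, 7, 5, 9, 1, 4, 6, 3, 32, 1, 5, 8, 11, 10, 5], dtype=float)
alpha_hat, beta_hat = 1.071, 0.106        # ML estimates for division 1 (Table 5)

x_sorted = np.sort(drought)
n = x_sorted.size
i = np.arange(1, n + 1)
positions = (i - 0.375) / (n + 0.25)      # Blom plotting positions
fitted = gamma.cdf(x_sorted, a=alpha_hat, scale=1.0 / beta_hat)   # F_X at the sorted data

# Points close to the diagonal indicate agreement between data and fitted marginal.
max_gap = np.abs(fitted - positions).max()
print(f"largest vertical gap from the diagonal: {max_gap:.3f}")
```

Plotting `fitted` against `positions` gives the probability plot for X; the plot for Y would need the marginal CDF obtained numerically from (6), which this sketch does not attempt.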

Table 7 Return period for (a) drought duration, (b) non-drought duration, (c) inter-arrival time

Division    3-Years    5-Years   10-Years   25-Years   50-Years  100-Years
(a)
1             3.958      9.054     15.831     24.692     31.356     38.000
2             3.767      8.869     15.649     24.511     31.176     37.820
3             4.647      9.724     16.492     25.347     32.009     38.651
4             1.478      6.679     13.497     22.379     29.053     35.702
5             0.956      6.196     13.023     21.911     28.587     35.238
(b)
1             2.834     14.018     41.513     98.836    157.776    230.001
2             2.577     13.467     40.584     97.422    156.001    227.867
3             3.860     16.099     44.954    104.030    164.277    237.801
4             0.427      7.779     30.431     81.553    135.875    203.517
5             0.186      6.727     28.389     78.252    131.637    198.339
(c)
1             9.453     25.182     55.878    117.286    179.513    255.027
2             8.950     24.516     54.874    115.798    177.633    252.803
3            11.306     27.654     59.606    122.798    186.323    263.153
4             3.314     17.235     43.795     98.894    156.457    227.393
5             2.117     15.757     41.547     95.370    151.989    221.994

Table 8 Estimated percentile for (a) inter-arrival time and (b) the proportion of drought for five climate divisions of Colorado

(a) Inter-arrival time for division
p             1          2          3          4          5
0.05      1.344      1.781      0.706      1.517      0.500
0.10      2.681      3.401      1.659      3.052      1.486
0.15      4.094      5.066      2.782      4.693      2.840
0.20      5.613      6.829      4.069      6.477      4.542
0.25      7.262      8.725      5.534      8.439      6.601
0.30      9.073     10.794      7.201     10.625      9.047
0.35     11.082     13.079      9.104     13.088     11.928
0.40     13.334     15.638     11.292     15.903     15.313
0.45     15.893     18.545     13.834     19.171     19.301
0.50     18.841     21.903     16.823     23.033     24.029
0.55     22.295     25.857     20.399     27.698     29.694
0.60     26.426     30.619     24.766     33.472     36.585
0.65     31.485     36.514     30.239     40.828     45.141
0.70     37.873     44.055     37.328     50.512     56.064
0.75     46.243     54.102     46.903     63.745     70.544
0.80     57.761     68.173     60.521     82.649     90.777
0.85     74.726     89.252     81.370    111.413    121.301
0.90    102.690    124.538    117.152    160.160    173.404
0.95    161.250    199.529    195.724    265.589    278.965

(b) Proportion for division
p             1          2          3          4          5
0.05      0.012      0.018      0.031      0.051      0.014
0.10      0.088      0.099      0.134      0.170      0.096
0.15      0.182      0.186      0.231      0.268      0.192
0.20      0.267      0.264      0.311      0.344      0.278
0.25      0.342      0.330      0.377      0.407      0.352
0.30      0.407      0.388      0.433      0.459      0.417
0.35      0.464      0.438      0.482      0.504      0.473
0.40      0.514      0.483      0.525      0.543      0.523
0.45      0.560      0.525      0.564      0.579      0.567
0.50      0.602      0.563      0.599      0.612      0.608
0.55      0.640      0.599      0.633      0.642      0.646
0.60      0.676      0.633      0.664      0.672      0.681
0.65      0.711      0.665      0.695      0.700      0.715
0.70      0.744      0.698      0.724      0.727      0.748
0.75      0.776      0.730      0.754      0.755      0.779
0.80      0.809      0.763      0.784      0.783      0.811
0.85      0.842      0.780      0.816      0.813      0.844
0.90      0.878      0.836      0.851      0.847      0.879
0.95      0.919      0.884      0.895      0.889      0.920

Table 6 shows that both criteria take smaller values for our proposed model than for Nadarajah’s (2009b) model in all five divisions of the State of Colorado, indicating that our proposed model gives a better fit.
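As a rough illustration of how the two criteria relate, the following Python sketch computes AIC from a negative log-likelihood using the standard definition AIC = 2·NL + 2k, where k is the number of model parameters. The function name `aic` and the model labels are assumptions made for this sketch. The pair −314.025 and −622.05 is taken from values visible at the end of Table 6 and, if these are indeed an NL/AIC pair, is consistent with k = 3. The NL value used for the four-parameter model is hypothetical and is not taken from the paper.

```python
def aic(neg_log_lik: float, n_params: int) -> float:
    """Akaike's information criterion from a negative log-likelihood (NL)."""
    return 2.0 * neg_log_lik + 2.0 * n_params


# Pair of values visible at the end of Table 6; consistent with k = 3.
nl_three_param = -314.025
aic_three_param = aic(nl_three_param, 3)

# Hypothetical NL for a four-parameter model (illustration only).
nl_four_param = -320.0
aic_four_param = aic(nl_four_param, 4)

scores = {
    "three-parameter model": aic_three_param,
    "four-parameter model (hypothetical NL)": aic_four_param,
}
preferred = min(scores, key=scores.get)  # smaller AIC is preferred

for name, score in scores.items():
    print(f"{name}: AIC = {score:.3f}")
print("preferred:", preferred)
```

Because AIC adds a penalty of 2 per parameter, a four-parameter model is preferred over a three-parameter one only if its NL is lower by more than 1.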

Estimating the return period of drought events is another important feature of drought analysis. Bonaccorso et al. (2003) argue that the return period is an important statistic for characterizing drought and that it provides information for improving water system management under dry conditions. The return period is defined in different ways depending on its application. Lloyd (1970), Loaiciga and Mariño (1991) and Shiau and Shen (2001) define the return period as the average elapsed time between occurrences of the critical event. On the other hand, Vogel (1987), Bras (1990) and Douglas et al. (2002) define it as the average number of trials required until the first occurrence of the critical event. Haan (1977) considers the return period of a variable a standard criterion in water resources system planning and management. In this paper, the return period defined in terms of drought duration (X), non-drought duration (Y) and inter-arrival time (S) is given as

$$F_X(x) = 1 - \frac{1}{N_x T_x}, \tag{25}$$

$$F_Y(y) = 1 - \frac{1}{N_y T_y}, \tag{26}$$

$$F_S(s) = 1 - \frac{1}{N_s T_s}, \tag{27}$$

respectively, where $F(\cdot)$ denotes the CDF of the respective random variable X, Y or S, T is the return period and N is the drought frequency per year. Solving (25) for the return period gives $T_x = 1/(N_x[1 - F_X(x)])$, and similarly for (26) and (27). The values obtained from (25)–(27) for T = 3, 5, 10, 25, 50 and 100 years are shown in Table 7a–c. Many important results can be read from Table 7a–c. For instance, a drought lasting about 16 months is expected to occur once every 10 years in climate divisions 1, 2 and 3, and one lasting about 13 months in climate divisions 4 and 5. Table 7b and c can be interpreted in the same way. Similarly, Shiau (2003) defines the return period for two hydrological variables either by the joint return period of X and Y or by the conditional return period of X given Y, or vice versa. Examples are the drought duration exceeding a specific value together with the non-drought duration exceeding another specific value, i.e. $(X > x, Y > y)$, or the inter-arrival time of drought exceeding a specific value given that the drought duration has exceeded a threshold, i.e. $(S > s, X > x)$. Because these relationships can be applied to various combinations of two hydrological variables, different combinations can result in the same return period.

Finally, we provide some quantiles $z_p$ associated with the CDFs of (10) and (14), respectively. These quantiles are computed numerically by solving the equations

$$\int_0^{z} f(s)\, ds = p \tag{28}$$

and

$$\int_0^{z} f(w)\, dw = p. \tag{29}$$
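Equations of the form (28) and (29) amount to finding the root in z of a monotone function. A minimal Python sketch of this idea follows, given purely for illustration. The density used is a unit-rate exponential stand-in, not the paper's Eq. (10) or (14), which are defined outside this section. The helper names `integrate` and `quantile`, the use of Simpson's rule with bisection, and the upper bracket of 50 are all assumptions of this sketch.

```python
import math


def integrate(f, a, b, n=2000):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def quantile(pdf, p, upper, tol=1e-8):
    """Solve integral_0^z pdf(t) dt = p for z by bisection on [0, upper]."""
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if integrate(pdf, 0.0, mid) < p:
            lo = mid  # too little mass accumulated: the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def stand_in_pdf(t):
    """Unit-rate exponential density; a stand-in, NOT Eq. (10) or (14)."""
    return math.exp(-t)


z_median = quantile(stand_in_pdf, 0.5, upper=50.0)
# For this stand-in the exact median is log(2), so the two values should agree closely.
print(z_median, math.log(2.0))
```

For a density supported on (0, 1), such as that of a proportion, the upper bracket would be 1 rather than an arbitrary large value.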

We use the function uniroot in R software for the numerical solution of equations (28) and (29). Table 8a–b provides some important numerical values of $z_p$, computed using the ML estimates for each climate division. We hope these will be useful for environmental scientists and practitioners.

7 Conclusion

Drought is a multivariate phenomenon, so its characteristics are better explained through a joint distribution. We have presented a new bivariate Gamma distribution, generated from a functional scale parameter, to model the inter-arrival time and the proportion of drought using drought duration and non-drought duration data. In some situations the standard distributions do not work properly because of the random and uncertain behaviour of extreme events. An appropriate choice of the functional relationship $\phi(x)$ makes this class of distributions very flexible, so that it can be adapted to the given circumstances. In this paper the choice $\phi(x) = \delta/x$ is justified by the fact that there is an insignificant correlation between the drought duration and the non-drought duration. This flexibility further suggests extending this class of distributions to multivariate distributions by the same procedure.

An application of the bivariate Gamma distribution to drought data from the State of Colorado is presented. The proposed bivariate Gamma distribution (4) seems to be a reasonable model for the drought data. We derived the explicit distributions of the inter-arrival time (S) and the proportion of drought (W) when the drought duration (X) and the non-drought duration (Y) follow the proposed bivariate distribution, and checked their fit to the observed data. The distribution of S provides an adequate fit to the observed drought data for all five climate divisions, whereas the fit of the distribution of W is questionable, so the corresponding results should be used with care.

We also estimated the model parameters of the proposed bivariate distribution by MCMC based on Jeffreys prior and by the maximum likelihood method. These estimates are quite stable and have small standard deviations. The fitted bivariate Gamma distribution is used to estimate the return periods for 3, 5, 10, 25, 50 and 100 years for drought duration, non-drought duration and inter-arrival time. These estimates can help water resource managers in future planning.

Acknowledgments The authors are thankful to the Associate Editor and the two referees for their valuable comments and suggestions, which significantly helped to improve the paper. The first author is also thankful to the Higher Education Commission of Pakistan for its financial support of this project.

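Appendix

The Fisher information matrix $Q(\hat{\eta})$ is given by:

$$
Q(\hat{\eta}) = -E\begin{bmatrix}
\frac{\partial^2 L(\eta)}{\partial\alpha^2} & \frac{\partial^2 L(\eta)}{\partial\alpha\,\partial\beta} & 0 & 0 \\
\frac{\partial^2 L(\eta)}{\partial\alpha\,\partial\beta} & \frac{\partial^2 L(\eta)}{\partial\beta^2} & 0 & 0 \\
0 & 0 & \frac{\partial^2 L(\eta)}{\partial\gamma^2} & \frac{\partial^2 L(\eta)}{\partial\gamma\,\partial\delta} \\
0 & 0 & \frac{\partial^2 L(\eta)}{\partial\gamma\,\partial\delta} & \frac{\partial^2 L(\eta)}{\partial\delta^2}
\end{bmatrix}
= n\begin{bmatrix}
\psi'(\alpha) & \frac{1}{\beta} & 0 & 0 \\
\frac{1}{\beta} & \frac{\alpha}{\beta^2} & 0 & 0 \\
0 & 0 & \psi'(\gamma) & \frac{1}{\delta} \\
0 & 0 & \frac{1}{\delta} & \frac{\gamma}{\delta^2}
\end{bmatrix}
$$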

Here $\psi'(\cdot)$ is the first derivative of the psi (digamma) function, also called the trigamma function.

References

Alley WM (1984) The Palmer Drought Severity Index: limitations and assumptions. J Clim Appl Meteorol 23:1100–1109
Blom G (1958) Statistical estimates and transformed beta-variables. Wiley, New York
Bonaccorso B, Cancelliere A, Rossi G (2003) An analytical formulation of return period of drought severity. Stoch Environ Res Risk Assess 17:157–174
Bras RL (1990) Hydrology: an introduction to hydrologic science. Addison-Wesley, Reading
Chambers J, Cleveland W, Kleiner B, Tukey P (1983) Graphical methods for data analysis. Chapman and Hall, London
Cheng KS, Hou JC, Liou JJ, Wu YC, Chiang JL (2010) Stochastic simulation of bivariate Gamma distribution: a frequency-factor based approach. Stoch Environ Res Risk Assess 25(2):107–122
Clarke RT (1980) Bivariate gamma distribution for extending annual stream flow records from precipitation: some large sample results. Water Resour Res 16:863–870
Douglas EM, Vogel RM, Kroll CN (2002) Impact of streamflow persistence on hydrologic design. J Hydrol Eng 7(3):220–227
Dupuis DJ (2010) Statistical modeling of the monthly Palmer drought severity index. J Hydrol Eng 15(10):796–808
Guerrero-Salazar P, Yevjevich V (1975) Analysis of drought characteristics by the theory of runs. Hydrology Paper Nr. 80, Colorado State University, Fort Collins
Haan CT (1977) Statistical methods in hydrology. Iowa State University Press, Ames
Hallack-Alegria M, Watkins DW Jr (2007) Annual and warm season drought Intensity–Duration–Frequency analysis for Sonora, Mexico. J Clim 20(9):1897–1909
Hao Z, Singh VP (2011) Bivariate drought analysis using entropy theory. 2011 Symposium on Data-Driven Approaches to Droughts, Paper 43. http://docs.lib.purdue.edu/ddad2011/43
Henningsen A, Toomet O (2011) maxLik: a package for maximum likelihood estimation in R. J Comput Stat 26:443–458
Hu Q, Willson GD (2000) Effects of temperature anomalies on the Palmer Drought Severity Index in the central United States. Int J Climatol 20:1899–1911
Husak JG, Michaelsen J, Funk C (2007) Use of the gamma distribution to represent monthly rainfall in Africa for drought monitoring applications. Int J Climatol 27:935–944
Johnson NL, Kotz S, Balakrishnan N (1994) Continuous univariate distributions, vol 1. Wiley, New York
Kim TW, Valdes JB, Yoo C (2006) Nonparametric approach for bivariate drought characterization using Palmer Drought Index. J Hydrol Eng 11(2):134–143
Kogan FN (1995) Droughts of the late 1980s in the United States as derived from NOAA polar-orbiting satellite data. Bull Am Meteor Soc 76:655–668
Lloyd EH (1970) Return period in the presence of persistence. J Hydrol 10(3):202–215
Loaiciga HA, Leipnik RB (2005) Correlated gamma variables in the analysis of microbial densities in water. Adv Water Resour 28:329–335
Loaiciga HA, Mariño MA (1991) Recurrence interval of geophysical events. J Water Resour Plan Manag 117(3):367–382
Martin AD, Quinn KM, Park JH (2011) MCMCpack: Markov Chain Monte Carlo in R. J Stat Softw 42(9):1–21
Mishra AK, Singh VP (2010) A review of drought concepts. J Hydrol 391:202–216
Mishra AK, Singh VP (2011) Drought modeling: a review. J Hydrol 403:157–175
Nadarajah S (2007) A bivariate gamma model for drought. Water Resour Res 43:W08501. doi:10.1029/2006WR005641
Nadarajah S (2008) The bivariate F distribution with application to drought data. Statistics 42(6):535–546
Nadarajah S (2009a) A bivariate Pareto model for drought. Stoch Environ Res Risk Assess 23:811–822
Nadarajah S (2009b) A bivariate distribution with gamma and beta marginals with application to drought data. J Appl Stat 36(3):277–301
Nadarajah S, Gupta AK (2006a) Cherian’s bivariate gamma distribution as a model for drought data. Agrociencia 40:483–490
Nadarajah S, Gupta AK (2006b) Intensity-duration models based on bivariate gamma distribution. Hiroshima Math J 36:387–395
Nadarajah S, Kotz S (2006) A note on the correlated gamma distribution of Loaiciga and Leipnik. Adv Water Resour 30:1053–1055
Porporato A, Laio F, Ridolfi L, Rodriguez-Iturbe I (2001) Plants in water-controlled ecosystem: active role in hydrologic processes and response to water stress: III. Vegetation water stress. Adv Water Resour 24:725–744
Prekopa A, Szantai T (1978) New multivariate gamma distribution and its fitting to empirical stream flow data. Water Resour Res 14:19–24
Prudnikov AP, Brychkov YA, Marichev OI (1986) Integrals and series, vols 1–3. Gordon and Breach Science Publishers, Amsterdam
Shiau JT (2003) Return period of bivariate distributed hydrological events. Stoch Environ Res Risk Assess 17:42–57
Shiau J, Shen HW (2001) Recurrence analysis of hydrologic droughts of differing severity. J Water Res Plan Manag 127(1):30–40
Song SB, Singh VP (2010) Frequency analysis of droughts using the Plackett copula and parameter estimation by genetic algorithm. Stoch Environ Res Risk Assess 24:783–805
Vogel RM (1987) Reliability indices for water supply systems. J Water Resour Plan Manag 113(4):645–654
Willeke G, Hosking JRM, Wallis JR, Guttman NB (1994) The National Drought Atlas. Institute for Water Resources Rep. 94-NDS-4. U.S. Army Corps of Engineers, Fort Belvoir, VA, 587 pp
Yevjevich V (1967) An objective approach to definitions and investigations of continental hydrologic drought. Hydrol. Pap., 23, Colorado State University, Fort Collins
Yue S (2001) A bivariate gamma distribution for use in multivariate flood frequency analysis. Hydrol Process 15:1033–1045
Yue S, Ouarda TBMJ, Bobee B (2001) A review of bivariate Gamma distribution for hydrological application. J Hydrol 246:1–18
