Parameter Estimation for the Truncated Pareto Distribution

Inmaculada B. Aban¹, Mark M. Meerschaert¹, and Anna K. Panorska²
Department of Mathematics and Statistics, University of Nevada

May 15, 2004

Abstract. The Pareto distribution is a simple model for nonnegative data with a power law probability tail. In many practical applications, there is a natural upper bound that truncates the probability tail. This paper develops parameter estimates for the truncated Pareto distribution, including a simple test to detect whether a given sample is truncated. We illustrate these methods with applications from finance, hydrology and atmospheric science.

Keywords and phrases: Pareto distribution, truncation, maximum likelihood estimator, order statistics, tail behavior.

¹ Partially supported by NSF grant DMS-0139927.
² Partially supported by NSF grants DMS-0139927 and ATM-0231781.

1. INTRODUCTION

A random variable X has a Pareto distribution if P(X > x) = Cx^{−α} for some α > 0 (see Johnson et al. [25] or Arnold [4]). This distributional model is important in applications because many data sets are observed to follow a power law probability tail, at least approximately, for large values of x. Stable distributions with index α and max-stable type II extreme value distributions with index α are also asymptotically Pareto in their probability tails, and this fact has been used frequently to develop tail estimators for those distributions. In some applications, there is a natural upper bound on the probability tail that truncates the Pareto law. In other cases, there is empirical evidence that a truncated Pareto gives a better fit to the data. Since both sums (if α < 2) and extremes (for any α > 0) fall in a different domain of attraction depending on whether a data set follows a Pareto or truncated Pareto distribution, it is also useful to test for possible truncation in a data set with evidence of power law tails.

In this paper, we develop parameter estimates for the truncated Pareto distribution that are easy to compute, we prove their consistency (and in some cases, asymptotic normality), and we develop a simple test to distinguish between truncated and unbounded Pareto distributional models on the basis of data. These methods should be useful to practitioners in areas of science and engineering where power law probability tails are prevalent. Heavy tailed random variables are important in applications to finance [16, 17, 24, 28, 31, 32, 34, 38], physics [5, 26, 27, 33, 35], hydrology [2, 8, 9, 23, 29, 44], engineering [36, 40, 41, 45] and many other fields [1, 18, 42]. Recently, Burroughs and Tebbens [12, 13] have surveyed evidence of truncated power law distributions in data sets on earthquake magnitudes, forest fire areas, fault lengths (on Earth and on Venus), and oil and gas field sizes. Earthquake fault sizes are limited by physical (or terrain) considerations; see Scholtz and Contreras [43] or Pacheco et al. [37]. The size of forest fires may be naturally limited by the availability of fuel and climate; see Malamud et al. [30]. Additional applications of truncated power law distributions in finance, groundwater hydrology, and atmospheric science are given in Section 5 of this paper.

Burroughs and Tebbens estimate parameters of the truncated Pareto distribution by least-squares fitting on a probability plot [11] and by minimizing the mean squared error of fit on a plot of the tail distribution function [13]. Minimum variance unbiased estimators for the parameters of a truncated Pareto law were developed in Beg [6, 7]. Beg [6] also provides maximum likelihood estimates of the lower truncation parameter, scale and probability of exceedance for a truncated Pareto distribution. The maximum likelihood estimator of α when the lower truncation limit is known was presented in [14], with some recommendations for the case when the lower truncation limit is not known. This paper develops maximum likelihood estimators (MLE) for all parameters of a truncated Pareto distribution. Existence and uniqueness of the MLE are proven under certain conditions that are shown to hold with probability approaching one as the sample size increases. A simple formula for the MLE is obtained that can be easily computed. Asymptotic normality is established for the estimator of the tail parameter α, and a simple test is proposed to distinguish between Pareto and upper-truncated Pareto models. We also consider the case where only the upper tail of the data fits a truncated Pareto distribution, and we compute the conditional MLE based on the upper order statistics. These results can be used for robust tail estimation for truncated models with power law tails (e.g., stable data). Examples from hydrology, climatology and finance illustrate the utility and practical application of these results.

2. MAXIMUM LIKELIHOOD ESTIMATION FOR THE ENTIRE SAMPLE

A random variable W has a Pareto distribution function if

(2.1)    P(W > w) = γ^α w^{−α}    for w ≥ γ > 0 and α > 0.

An upper-truncated Pareto random variable, X, has distribution

(2.2)    F_X(x) = P(X > x) = γ^α (x^{−α} − ν^{−α}) / [1 − (γ/ν)^α]

and density

(2.3)    f_X(x) = α γ^α x^{−α−1} / [1 − (γ/ν)^α]

for 0 < γ ≤ x ≤ ν < ∞, where γ < ν. Consider a random sample X_1, X_2, . . . , X_n from this upper-truncated distribution and let X(1) ≥ X(2) ≥ . . . ≥ X(n) denote their order statistics. We now derive the maximum likelihood estimator (MLE) for the unknown parameters. We start by considering the simplest case, where γ and ν are both known.

Theorem 2.1. The MLE α̃ of α for an upper-truncated Pareto defined in (2.2), where γ and ν are known, solves the equation

(2.4)    n/α̃ + n (γ/ν)^α̃ ln(γ/ν) / [1 − (γ/ν)^α̃] − Σ_{i=1}^n [ln X(i) − ln γ] = 0.

Furthermore, α̃ has an asymptotic normal distribution with asymptotic mean α and asymptotic variance

    { 1/α² − (γ/ν)^α [ln(γ/ν)]² / [1 − (γ/ν)^α]² }^{−1}.


Proof. The likelihood function under this setting is given by

(2.5)    L(γ, ν, α) = α^n γ^{nα} / [1 − (γ/ν)^α]^n ∏_{i=1}^n X_i^{−α−1} ∏_{i=1}^n I{0 < γ ≤ X_i ≤ ν < ∞},

where γ < ν. The natural logarithm of this likelihood is

(2.6)    −n ln[1 − (γ/ν)^α] + n ln α + n α ln γ − (α + 1) Σ_{i=1}^n ln X(i)

for 0 < γ ≤ X_i ≤ ν < ∞, i = 1, . . . , n. Taking the derivative with respect to α, we get

    n/α̃ + n (γ/ν)^α̃ ln(γ/ν) / [1 − (γ/ν)^α̃] − Σ_{i=1}^n [ln X(i) − ln γ].

Setting the above expression equal to zero and solving, we get the MLE for α. To obtain the asymptotic distribution of the MLE α̃, note that the density in (2.3) may be written as f_X(x) = exp{B(α)T(x) − A(α)} H(x), where

    A(α) = − ln(αγ^α) + ln[1 − (γ/ν)^α],    B(α) = α,    T(x) = − ln x,

and H(x) = x^{−1} I{0 < γ ≤ x ≤ ν < ∞}. Hence, when γ and ν are fixed and known, this upper-truncated Pareto model is a one-parameter exponential family of distributions. Applying the results on the asymptotic properties of the MLE for a one-parameter exponential family (see for instance Theorem 5.3.5 in [10]), α̃ has an asymptotic normal distribution with an asymptotic mean of α and an asymptotic variance given by the inverse of the Fisher information. In this case, the Fisher information is found to be

    ∂²A(α)/∂α² = 1/α² − (γ/ν)^α [ln(γ/ν)]² / [1 − (γ/ν)^α]².

Remark 2.2. The MLE for α presented in equation (2.4) appeared first in [14] for the case when the lower truncation limit γ was known. The authors also gave recommendations on the computation of the estimate when γ was unknown.
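In practice, (2.4) is solved numerically. Its left-hand side is strictly decreasing in α̃ (this is the content of (2.10) below), so any bracketing root-finder applies. The sketch below is our own illustration rather than part of the paper; it assumes numpy and scipy are available, and the function names are hypothetical.

    import numpy as np
    from scipy.optimize import brentq

    def trunc_pareto_tail(x, alpha, gamma, nu):
        """Tail function P(X > x) of the upper-truncated Pareto, equation (2.2)."""
        return gamma**alpha * (x**(-alpha) - nu**(-alpha)) / (1.0 - (gamma / nu)**alpha)

    def mle_alpha_known_bounds(x, gamma, nu):
        """Solve the estimating equation (2.4) for alpha when gamma and nu are known."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        b = gamma / nu                                # b in (0, 1)
        s = np.sum(np.log(x) - np.log(gamma))

        def g(a):                                     # left-hand side of (2.4)
            return n / a + n * b**a * np.log(b) / (1.0 - b**a) - s

        # g is strictly decreasing in a by (2.10); widen the bracket if needed.
        return brentq(g, 1e-6, 1e3)

    def inverse_fisher_info(alpha, gamma, nu):
        """Inverse Fisher information from Theorem 2.1; divide by n to
        approximate the variance of the MLE alpha-tilde."""
        b = gamma / nu
        info = 1.0 / alpha**2 - b**alpha * np.log(b)**2 / (1.0 - b**alpha)**2
        return 1.0 / info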


Next we obtain the MLEs for the parameters of an upper-truncated Pareto where α, γ, and ν are all unknown.

Theorem 2.3. Consider a random sample X_1, X_2, . . . , X_n from an upper-truncated Pareto defined in (2.2) where α, γ and ν are unknown. The maximum likelihood estimators for the parameters in this model are given by

    γ̂ = X(n) = min(X_1, X_2, . . . , X_n),    ν̂ = X(1) = max(X_1, X_2, . . . , X_n),

and α̂ solves the equation

(2.7)    n/α̂ + n [X(n)/X(1)]^α̂ ln[X(n)/X(1)] / (1 − [X(n)/X(1)]^α̂) − Σ_{i=1}^n [ln X(i) − ln X(n)] = 0.

Proof. The likelihood function for this model is given by

    L(γ, ν, α) = α^n γ^{nα} / [1 − (γ/ν)^α]^n ∏_{i=1}^n X_i^{−α−1} ∏_{i=1}^n I{γ ≤ X_i ≤ ν}.

Note that ∏_{i=1}^n I{γ ≤ X_i ≤ ν} = 1 if and only if X(n) ≥ γ and X(1) ≤ ν. Since the likelihood function L is an increasing function of γ for γ ≤ X(n) and a decreasing function of ν for ν ≥ X(1), then for a fixed α, the likelihood is maximized when γ̂ = X(n) and ν̂ = X(1). Replacing γ and ν by X(n) and X(1), respectively, and taking the natural log of the likelihood, we get

    −n ln[1 − (X(n)/X(1))^α] + n ln α + n α ln X(n) − (α + 1) Σ_{i=1}^n ln X(i).

Taking the partial derivative with respect to α and setting it equal to zero, the MLE for α is the α̂ which solves equation (2.7).
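Computationally, Theorem 2.3 reduces the three-parameter fit to the same one-dimensional root-finding problem as before, with γ and ν replaced by the sample minimum and maximum. A minimal sketch under the same assumptions as the previous one (numpy and scipy; hypothetical names):

    import numpy as np
    from scipy.optimize import brentq

    def trunc_pareto_mle(x):
        """MLE (alpha-hat, gamma-hat, nu-hat) of Theorem 2.3 for a full sample."""
        x = np.asarray(x, dtype=float)
        n = len(x)
        gamma_hat, nu_hat = x.min(), x.max()      # gamma-hat = X(n), nu-hat = X(1)
        b = gamma_hat / nu_hat
        s = np.sum(np.log(x) - np.log(gamma_hat))

        def g(a):                                 # left-hand side of (2.7)
            return n / a + n * b**a * np.log(b) / (1.0 - b**a) - s

        return brentq(g, 1e-6, 1e3), gamma_hat, nu_hat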



When γ and ν are unknown, the upper-truncated Pareto distribution has support that depends on these unknown parameters, and hence it is not a member of a multiparameter exponential family of distributions. Furthermore, it does not satisfy the regularity conditions necessary to directly apply the asymptotic properties of MLEs to derive the asymptotic distribution of α̂. We will instead use a Taylor series expansion to obtain the asymptotic properties of α̂. First we need to show that γ̂ and ν̂ are consistent estimators for γ and ν, respectively.

Lemma 2.4. Under the conditions of Theorem 2.3, γ̂ and ν̂ are consistent estimators of γ and ν.


Proof. The cdf of the minimum, X(n), of a truncated Pareto random sample is

    F_{X(n)}(x) = P{X(n) ≤ x} = 1 − [ ((γ/x)^α − (γ/ν)^α) / (1 − (γ/ν)^α) ]^n.

For any small ε > 0, P{|X(n) − γ| > ε} = P{X(n) > γ + ε}, and consequently

    P{|X(n) − γ| > ε} = [ ((γ/(γ + ε))^α − (γ/ν)^α) / (1 − (γ/ν)^α) ]^n → 0

as n → ∞. Thus, γ̂ = X(n) converges in probability to γ, and hence γ̂ is a consistent estimator of γ. Similarly, we can show that ν̂ is a consistent estimator of ν. The cdf of the maximum, X(1), of a truncated Pareto random sample is

    F_{X(1)}(x) = P{X(1) ≤ x} = [ (1 − (γ/x)^α) / (1 − (γ/ν)^α) ]^n

and hence for any small ε > 0,

    P{|X(1) − ν| > ε} = P{X(1) ≤ ν − ε} = [ (1 − (γ/(ν − ε))^α) / (1 − (γ/ν)^α) ]^n → 0

as n → ∞. Hence, ν̂ = X(1) converges in probability to ν.



Now we derive the asymptotic properties of α̂.

Theorem 2.5. Under the conditions of Theorem 2.3, the MLE α̂ of α has an asymptotic normal distribution with asymptotic mean α.

Proof. For a given γ and ν, suppose that h ≡ h(γ, ν) solves the equation

(2.8)    G = 1/h + (γ/ν)^h ln(γ/ν) / [1 − (γ/ν)^h] − (1/n) Σ_{i=1}^n [ln X(i) − ln γ] = 0

for a fixed γ and ν. Then from Theorem 2.1, α̃ = h(γ₀, ν₀) where γ₀ and ν₀ are the true values of γ and ν, respectively, and by Theorem 2.3, α̂ = h(γ̂, ν̂) where γ̂ = X(n) and ν̂ = X(1). By a Taylor series expansion about γ₀ and ν₀,

(2.9)    α̂ = h(γ̂, ν̂) ≈ h(γ₀, ν₀) + h_γ(γ₀, ν₀)(γ̂ − γ₀) + h_ν(γ₀, ν₀)(ν̂ − ν₀)


where h_γ(γ₀, ν₀) and h_ν(γ₀, ν₀) are the partial derivatives of h with respect to γ and ν, respectively, evaluated at γ₀ and ν₀. By Lemma 2.4, the last two terms on the right side go to zero in probability as n → ∞, provided that h_γ(γ₀, ν₀) and h_ν(γ₀, ν₀) are bounded. Applying Slutsky's Theorem (see for instance [10]), it then follows that α̂ has asymptotic mean α and converges in distribution to an asymptotic normal distribution. We finish the proof by showing that the partial derivatives h_γ(γ₀, ν₀) and h_ν(γ₀, ν₀) are bounded. If (2.8) holds then

    dG/dγ = ∂G/∂γ + (∂G/∂h)(∂h/∂γ) = 0,

implying

    h_γ(γ₀, ν₀) = ∂h/∂γ = −(∂G/∂γ)/(∂G/∂h),

and similarly, h_ν = −(∂G/∂ν)/(∂G/∂h). Since neither ∂G/∂γ nor ∂G/∂ν has any singularities for 0 < γ < ν and h > 0, both h_γ(γ₀, ν₀) and h_ν(γ₀, ν₀) are bounded if we can show that ∂G/∂h ≠ 0. Let b = γ/ν. In fact we have

(2.10)    ∂G/∂h = [−(1 − b^h)² + h² b^h (ln b)²] / [h² (1 − b^h)²] < 0    for every 0 < b < 1 and h > 0.

To see this, first note that

(2.11)    e^x − e^{−x} > 2x    for all x > 0.

Now (2.10) is equivalent to

(2.12)    h² b^h (ln b)² < (1 − b^h)²
          −h b^{h/2} (ln b) < 1 − b^h
          −h ln b < b^{−h/2} − b^{h/2}

and the last line is true for all h > 0 using (2.11) where x = (−h/2) ln b.



Remark 2.6. The asymptotic variance of α̂ is not the same as the asymptotic variance of α̃, since adjustments need to be made to account for the fact that γ and ν were estimated. Furthermore, the standard theory of obtaining the asymptotic variance from the Fisher information matrix is no longer applicable.
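Given Remark 2.6, the sampling variability of α̂ is most easily examined by simulation. Inverting the tail function (2.2) yields an exact sampler for the truncated Pareto; the construction below is ours, not the paper's:

    import numpy as np

    def rtrunc_pareto(n, alpha, gamma, nu, rng=None):
        """Draw n truncated Pareto variates by inverting the tail function (2.2)."""
        rng = np.random.default_rng() if rng is None else rng
        v = rng.uniform(size=n)                   # v plays the role of P(X > x)
        b_a = (gamma / nu)**alpha
        return gamma * (v * (1.0 - b_a) + b_a)**(-1.0 / alpha)

Applying the MLE of Theorem 2.3 to many such samples gives a Monte Carlo approximation to the variance of α̂.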


The next result addresses the issue of existence and uniqueness of the MLE for α, obtained by solving (2.4) in the case where γ and ν are known, or (2.7) in the case where all three parameters are estimated.

Theorem 2.7. The probability that a solution to the MLE equation (2.4) or (2.7) exists tends to one as n → ∞, and if it exists, the solution is unique.

Proof. First we consider the case where γ and ν are known. In this case the MLE α̃ = h, where h > 0 solves the equation

(2.13)    G(h) = 1/h + b^h ln b / (1 − b^h) − (1/n) Σ_{i=1}^n [ln X(i) − ln γ] = 0,

where b = γ/ν ∈ (0, 1). Write M = (1/n) Σ_{i=1}^n [ln X(i) − ln γ] and compute

(2.14)    lim_{h→0} G(h) = (− ln b)/2 − M    and    lim_{h→∞} G(h) = −M.

In view of (2.10) we have that ∂G/∂h < 0 for all h > 0, so that G is monotone. Hence, the solution to G(h) = 0 is unique if it exists, and it exists if and only if M < (− ln b)/2. As n → ∞, we have M → E{ln(X/γ)} in probability by the law of large numbers, where W = ln(X/γ) has a truncated exponential distribution with density

    g(w) = α exp{−α w} / [1 − exp{−α(− ln b)}]

supported on 0 < w < − ln b. Since the density of W is monotone decreasing, a simple geometrical argument shows that the mean of W must lie in the left half of the interval [0, − ln b], and hence EW < −(1/2) ln b. Then P{M < (− ln b)/2} → 1 as n → ∞, so that the MLE α̃ exists with probability approaching one as n → ∞. In the case where all three parameters are estimated, the MLE α̂ = h solves

(2.15)    G(h) = 1/h + b^h ln b / (1 − b^h) − (1/n) Σ_{i=1}^n [ln X(i) − ln X(n)] = 0,

where b = X(n)/X(1) ∈ (0, 1) with probability one. Write M′ = (1/n) Σ_{i=1}^n [ln X(i) − ln X(n)] and compute

(2.16)    lim_{h→0} G(h) = (− ln b)/2 − M′    and    lim_{h→∞} G(h) = −M′

as before. Now (2.10) shows that G is monotone, and hence the solution α̂ = h to G(h) = 0 is unique if it exists, and it exists if and only if M′ = M + ln(γ/X(n)) < (− ln b)/2. Lemma 2.4 shows that X(n) → γ in probability, and then the continuous mapping theorem implies that ln(γ/X(n)) → 0 in probability. Since M → E{ln(X/γ)} in probability, another application of the continuous mapping theorem shows that M′ → E{ln(X/γ)} in probability as well, and the same argument as before shows that the MLE α̂ exists with probability approaching one as n → ∞.
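In practice, Theorem 2.7 also supplies a cheap pre-check to run before any root-finding: a unique solution of (2.15) exists precisely when M′ < (− ln b)/2. A sketch of this check, in the same hypothetical notation as the earlier sketches:

    import numpy as np

    def mle_root_exists(x):
        """Existence condition of Theorem 2.7 (all three parameters unknown)."""
        x = np.asarray(x, dtype=float)
        b = x.min() / x.max()                     # b = X(n)/X(1)
        m = np.mean(np.log(x) - np.log(x.min()))  # M' in the notation of the proof
        return m < -0.5 * np.log(b)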



3. MAXIMUM LIKELIHOOD ESTIMATION FOR THE TAIL

In many practical applications, a truncated Pareto may be fit to the upper tail of the data, if for sufficiently large x > 0,

    P(X > x) ≈ γ^α (x^{−α} − ν^{−α}) / [1 − (γ/ν)^α],    0 < γ ≤ x ≤ ν.

In this case, we estimate the parameters by obtaining the conditional maximum likelihood estimate based on the r + 1 (0 ≤ r < n) largest order statistics, representing only the portion of the tail where the truncated Pareto approximation holds. This extends the well-known Hill's estimator [21, 22] to the case of a truncated Pareto distribution.

Theorem 3.1. When X(r) > X(r+1), the conditional maximum likelihood estimator for the parameters of the upper-truncated Pareto in (2.2), based on the r + 1 largest order statistics, is given by

    ν̂ = X(1),
    γ̂ = r^{1/α̂} X(r+1) [n − (n − r)(X(r+1)/X(1))^α̂]^{−1/α̂},

and α̂ solves the equation

(3.1)    0 = r/α̂ + r (X(r+1)/X(1))^α̂ ln(X(r+1)/X(1)) / [1 − (X(r+1)/X(1))^α̂] − Σ_{i=1}^r [ln X(i) − ln X(r+1)].

Proof. The proof is similar to that of Hill's estimator ([22]). It is convenient to transform the data, taking Z(i) = (X(n−i+1))^{−1}, so that

    G(z) = P(Z ≤ z) = γ^α (z^α − ν^{−α}) / [1 − (γ/ν)^α]

and Z(1) ≥ · · · ≥ Z(n). Since U(i) = G(Z(i)) are (decreasing) order statistics from a uniform distribution, E(i) = − ln U(i) are (increasing) order statistics from a unit exponential. Following [15] pp. 20–21, we let Y_i = (n − i + 1)(E(i) − E(i−1)), and hence one can easily check that {Y_i, i = 1, . . . , n} are independent and identically distributed unit exponential. Define Y* = nE(n−r+1) = n(Y_1/n + · · · + Y_{n−r+1}/r), so that Y_n, . . . , Y_{n−r+2}, Y* are mutually independent with joint density exp(−y_{(n)} − · · · − y_{(n−r+2)}) p(y*), where p(y*) is the density of Y*. Since U(n−r+1) has density K₁ u^{r−1}(1 − u)^{n−r}, it follows that Y* = −n ln U(n−r+1) has density p(y) = K₂ [exp(−y/n)]^r [1 − exp(−y/n)]^{n−r}, where {K_j, j = 1, 2} are positive constants. Use the fact that

    Y_i = (n − i + 1)[− ln G(Z(i)) + ln G(Z(i−1))] = (n − i + 1)[ln(Z(i−1)^α − ν^{−α}) − ln(Z(i)^α − ν^{−α})],

for i = n − r + 2, . . . , n, and

    exp(−Y*/n) = G(Z(n−r+1)) = γ^α (Z(n−r+1)^α − ν^{−α}) / [1 − (γ/ν)^α],

to obtain the likelihood conditional on the values of the r + 1 smallest order statistics, Z(n) = z(n), . . . , Z(n−r+1) = z(n−r+1), as

    K ∏_{i=1}^{r} [ α z_{(n−i+1)}^{α−1} / (z_{(n−i+1)}^α − ν^{−α}) ] exp{ −Σ_{i=1}^{r−1} i [ln(z_{(n−i)}^α − ν^{−α}) − ln(z_{(n−i+1)}^α − ν^{−α})] }
        × [ γ^α (z_{(n−r+1)}^α − ν^{−α}) / (1 − (γ/ν)^α) ]^r [ 1 − γ^α (z_{(n−r+1)}^α − ν^{−α}) / (1 − (γ/ν)^α) ]^{n−r} ∏_{i=1}^{r} I{1/ν ≤ z_{(n−i+1)} ≤ 1/γ},

where K > 0 does not depend on the data or the parameters, and the first product term is the Jacobian for the transformations defined by Y_n, . . . , Y_{n−r+2}, Y*. Note that

    Σ_{i=1}^{r−1} i [ln(z_{(n−i)}^α − ν^{−α}) − ln(z_{(n−i+1)}^α − ν^{−α})] = r ln(z_{(n−r+1)}^α − ν^{−α}) − Σ_{i=1}^{r} ln(z_{(n−i+1)}^α − ν^{−α}).

Next, condition on Z(n−r+1) ≤ d < Z(n−r), which multiplies the conditional likelihood by a factor of

    [ 1 − γ^α (z_{(n−r+1)}^α − ν^{−α}) / (1 − (γ/ν)^α) ]^{−(n−r)} [ 1 − γ^α (d^α − ν^{−α}) / (1 − (γ/ν)^α) ]^{n−r}.

Then the conditional likelihood function simplifies to

    K α^r [ γ^{rα} (1 − (γd)^α)^{n−r} / (1 − (γ/ν)^α)^n ] ∏_{i=1}^{r} z_{(n−i+1)}^{α−1} ∏_{i=1}^{r} I{1/ν ≤ z_{(n−i+1)} ≤ 1/γ}.

Substitute β = (γd)^α to obtain

    K α^r [ β^r d^{−rα} (1 − β)^{n−r} / (1 − β(dν)^{−α})^n ] ∏_{i=1}^{r} z_{(n−i+1)}^{α−1} ∏_{i=1}^{r} I{1/ν ≤ z_{(n−i+1)} ≤ 1/γ}.


In terms of the original data, this conditional likelihood becomes

    K α^r [ β^r D^{rα} (1 − β)^{n−r} / (1 − β(D/ν)^α)^n ] ∏_{i=1}^{r} x_{(i)}^{−α+1} ∏_{i=1}^{r} x_{(i)}^{−2} ∏_{i=1}^{r} I{γ ≤ x_{(i)} ≤ ν},

where d = D^{−1} and the product term ∏_{i=1}^{r} x_{(i)}^{−2} is the Jacobian associated with the transformations X(i) = (Z(n−i+1))^{−1}, for i = 1, . . . , r. Note that ∏_{i=1}^{r} I{γ ≤ x_{(i)} ≤ ν} = 1 if and only if x(1) ≤ ν and x(r) ≥ γ. Consequently, whenever x(1) ≤ ν and x(r) ≥ γ, the conditional log-likelihood is

    ln L = K₀ + r ln α + r ln β + rα ln D + (n − r) ln(1 − β) − n ln(1 − β(D/ν)^α) − (α + 1) Σ_{i=1}^{r} ln(x_{(i)}),

and we seek the global maximum over the parameter space consisting of all (α, β) for which α > 0 and 0 < β < 1. Since this conditional log-likelihood is a decreasing function of ν for ν ≥ X(1), the conditional MLE for ν is ν̂ = X(1). Substituting ν̂ for ν and taking partial derivatives of the conditional log-likelihood with respect to α and β, we get

(3.2)    ∂ ln L/∂α = r/α + nβ(D/X(1))^α ln(D/X(1)) / [1 − β(D/X(1))^α] − Σ_{i=1}^r [ln X(i) − ln D]

         ∂ ln L/∂β = r/β − (n − r)/(1 − β) + n(D/X(1))^α / [1 − β(D/X(1))^α].

We obtain the MLEs α̂ and β̂ by setting these partial derivatives to zero and solving. The solution to (3.2) depends on the value of D, which will be unknown in practice. Since we assume X(r) ≥ D > X(r+1), we could estimate D by X(r) or X(r+1) or some point in between. We use the estimate D = X(r+1), to be consistent with standard usage for Hill's estimator [3, 19, 21, 24, 28, 40, 41], the MLE for α in a Pareto model. Setting the equations in (3.2) to zero and solving, we find that

    β̂ = r [n − (n − r)(X(r+1)/X(1))^α̂]^{−1}

and α̂ solves the equation

    0 = r/α̂ + r (X(r+1)/X(1))^α̂ ln(X(r+1)/X(1)) / [1 − (X(r+1)/X(1))^α̂] − Σ_{i=1}^r [ln X(i) − ln X(r+1)].

We complete the proof of the theorem by using γ = Dβ^{1/α}.
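A sketch of the estimator of Theorem 3.1, in the same hypothetical Python notation as before; the bracketing root-finder again exploits the monotonicity established in (2.10):

    import numpy as np
    from scipy.optimize import brentq

    def tail_trunc_pareto_mle(x, r):
        """Conditional MLE of Theorem 3.1 from the r + 1 largest order statistics."""
        xs = np.sort(np.asarray(x, dtype=float))[::-1]   # X(1) >= X(2) >= ... >= X(n)
        n = len(xs)
        if not xs[r - 1] > xs[r]:
            raise ValueError("Theorem 3.1 requires X(r) > X(r+1)")
        c = xs[r] / xs[0]                                # X(r+1)/X(1), in (0, 1)
        s = np.sum(np.log(xs[:r]) - np.log(xs[r]))

        def g(a):                                        # right-hand side of (3.1)
            return r / a + r * c**a * np.log(c) / (1.0 - c**a) - s

        alpha_hat = brentq(g, 1e-6, 1e3)
        nu_hat = xs[0]
        gamma_hat = r**(1.0 / alpha_hat) * xs[r] * (
            n - (n - r) * c**alpha_hat)**(-1.0 / alpha_hat)
        return alpha_hat, gamma_hat, nu_hat

Dropping the middle term of g recovers Hill's estimator r/s, consistent with Remark 3.2 below.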




Remark 3.2. The conditional maximum likelihood estimator α̂_TP of α given in Theorem 3.1 is smaller than Hill's estimator α̂_H of α. This is easy to see after noticing that the second term on the right-hand side of equation (3.1) is always negative.

Remark 3.3. In some applications, it may be natural to employ a truncated and possibly shifted exponential distribution. All of the results in this section and the previous section also apply to that case, since Y = ln X has a truncated shifted exponential distribution with

    P(Y > y) = [e^{−α(y − ln γ)} − (γ/ν)^α] / [1 − (γ/ν)^α]    for ln γ ≤ y ≤ ln ν

if and only if X has a truncated Pareto distribution. If γ = 1 then Y has a truncated exponential distribution with no shift.

4. TEST FOR TRUNCATION

In this section, we present a simple test to distinguish whether a given data set is more appropriately modeled by a Pareto or truncated Pareto distribution. The truncated Pareto distribution is given by (2.2) with 0 < γ < ν < ∞, which reduces to the Pareto distribution (2.1) in the limit ν = ∞. An asymptotic q-level test (0 < q < 1) for H0: ν = ∞ (Pareto) versus Ha: ν < ∞ (truncated Pareto) can be constructed using the following simple result from extreme value theory.

Lemma 4.1. Suppose that X_1, X_2, . . . , X_n are independent and identically distributed according to a Pareto distribution, and let X(1) ≥ X(2) ≥ . . . ≥ X(n) denote their order statistics.

Then n^{−1/α} X(1) → Y in distribution as n → ∞, where P(Y ≤ t) = exp{−Ct^{−α}} and C = γ^α as in (2.1).

Proof. Write

    P{n^{−1/α} X(1) ≤ x} = P{X(1) ≤ n^{1/α} x} = [1 − Cx^{−α}/n]^n → exp{−Cx^{−α}},

which proves the lemma.


The random variable Y has a max-stable or Type II extreme value distribution. The lemma implies that for large n, the maximum X(1) ≈ n^{1/α} Y, and then

(4.1)    P(X(1) ≤ x) ≈ P(n^{1/α} Y ≤ x) = P(Y ≤ n^{−1/α} x) = exp{−C(n^{−1/α} x)^{−α}} = exp{−n C x^{−α}}.

Then an asymptotic level q test (0 < q < 1) rejects the null hypothesis H0: ν = ∞ (Pareto) in favor of the alternative Ha: ν < ∞ (truncated Pareto) if X(1) < x_q, where q = exp{−n C x_q^{−α}}. This test rejects H0 if and only if

(4.2)    X(1) < [n C / (− ln q)]^{1/α},

and the p-value of this test is given by

(4.3)    p = exp{−n C X(1)^{−α}}.

In practice we use Hill's estimator [22],

    α̂_H = [ r^{−1} Σ_{i=1}^r (ln X(i) − ln X(r+1)) ]^{−1},    and    Ĉ = (r/n)(X(r+1))^{α̂_H}

to estimate the parameters C and α. Note that a small p-value in (4.3) indicates that the Pareto model is not a good fit, but this is not enough in itself to indicate goodness of fit of the truncated Pareto distribution. Hence the test should always be supplemented by a graphical check of the data tail, as illustrated in the next section.
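The entire test takes only a few lines in code. A sketch under the same assumptions as the earlier ones (numpy; hypothetical names), combining Hill's estimator with (4.2) and (4.3):

    import numpy as np

    def pareto_truncation_test(x, r, q=0.05):
        """Test H0: Pareto (nu = infinity) vs Ha: truncated Pareto, via (4.2)-(4.3)."""
        xs = np.sort(np.asarray(x, dtype=float))[::-1]
        n = len(xs)
        alpha_h = 1.0 / np.mean(np.log(xs[:r]) - np.log(xs[r]))  # Hill's estimator
        c_hat = (r / n) * xs[r]**alpha_h
        x_q = (n * c_hat / (-np.log(q)))**(1.0 / alpha_h)        # critical value, (4.2)
        p_value = np.exp(-n * c_hat * xs[0]**(-alpha_h))         # p-value, (4.3)
        return xs[0] < x_q, p_value                              # (reject H0?, p-value)

Rejecting when X(1) < x_q is equivalent to rejecting when the p-value falls below q.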

We performed Monte Carlo simulations to investigate the small-sample properties of the proposed test for truncation. We generated random samples from a Pareto distribution with α = 0.5 and C = 1 for varying n ∈ {100, 500, 1000}. The values of r were also varied based on the percentage of the total sample size n; in particular, we considered the largest 10%, 30%, 70% and 100% of the order statistics. We performed the proposed test at the 5% level of significance for each combination of n and r, and repeated this process 1000 times for each case. The simulated significance level was the proportion of times out of 1000 that the null hypothesis was rejected. The results of these simulations are summarized in Table 1. The proposed test is conservative, i.e., the simulated significance level is smaller than the set significance level. This means that for small sample sizes it will be more difficult to reject the Pareto assumption in favor of the truncated Pareto. However, the simulated significance level gets closer to 5% as n and r increase, as expected from the asymptotic properties of the maximum under the Pareto distribution.

Table 1. Simulated significance level (in %) of the proposed test for truncation, using a 5% level of significance, for varying sample sizes and percent of largest order statistics used, based on 1000 replicates. Pareto random samples of size n were generated with α = 0.5 and C = 1.

                        Percent of largest order statistics used
    Sample Size (n)      10%      30%      70%      100%
    100                  **       0.3      1.8      2.0
    500                  1.3      3.8      4.1      4.9
    1000                 3.3      4.7      5.2      4.8

Simulation results are exactly the same for any values of α and C, when using the same values of n and r and the same random number seed. To understand why this is the case, note that we simulate X_i = (C/U_i)^{1/α}, where the U_i are iid uniform (0, 1) random variables, and hence X(i) = (C/U(i))^{1/α}. The test rejects the null hypothesis if

    X(1) < [n Ĉ / (− ln q)]^{1/α̂_H}.
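A sketch of the simulation experiment just described, under stated assumptions: the Pareto variates are generated exactly as in the text, while the function name, replication scheme and seed handling are ours.

    import numpy as np

    def simulated_level(n, r, alpha=0.5, c=1.0, q=0.05, reps=1000, seed=0):
        """Monte Carlo estimate of the significance level under H0 (Pareto)."""
        rng = np.random.default_rng(seed)                 # requires 1 <= r <= n - 1
        rejections = 0
        for _ in range(reps):
            x = (c / rng.uniform(size=n))**(1.0 / alpha)  # X_i = (C/U_i)^(1/alpha)
            xs = np.sort(x)[::-1]
            alpha_h = 1.0 / np.mean(np.log(xs[:r]) - np.log(xs[r]))
            c_hat = (r / n) * xs[r]**alpha_h
            if xs[0] < (n * c_hat / (-np.log(q)))**(1.0 / alpha_h):
                rejections += 1
        return rejections / reps                          # compare with Table 1

In the rejection criterion, C and α cancel once X(i) = (C/U(i))^{1/α} is substituted, leaving a condition on the uniforms alone; this is why the same seed gives identical results for any α and C.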
