New representations for the conditional approach to multivariate

0 downloads 0 Views 951KB Size Report
Feb 9, 2014 - marginal distribution functions FX and FY are standard Gumbel distributed, ..... the form â(x)+b(x)zp, where â(x), b(x) are maximum likelihood ...
New representations for the conditional approach to multivariate extremes Ioannis Papastathopoulos* and Jonathan A. Tawn† *

arXiv:1402.1908v1 [math.PR] 9 Feb 2014



School of Mathematics, University of Bristol Department of Mathematics and Statistics, Lancaster University [email protected] [email protected]

Abstract The conditional approach to multivariate extremes concerns the characterization of the limiting distribution of appropriately normalized random vectors given that at least one of their components is large. The statistical methods for the conditional approach are based on a parametric family of location and scale norming functions proposed by Heffernan and Tawn (2004). Recently, inverted max-stable processes have been proposed as an important new class for spatial extremes covering asymptotic independence in contrast to max-stable processes which are asymptotically dependent. We study a broad range of inverted maxstable processes and present examples where the normalizations required for non-degenerate conditional limit laws do not belong to the parametric family identified by Heffernan and Tawn. Despite such differences at an asymptotic level, we show that at practical levels, the model of Heffernan and Tawn approximates well the true conditional distributions.

Key-words: Asymptotic independence; Brown–Resnick process; conditional dependence; extremal Gaussian process; H¨ usler–Reiss copula; inverted max-stable distribution

1

Introduction

Let (X, Y ) be a bivariate random variable and assume, without loss of generality, that the marginal distribution functions FX and FY are standard Gumbel distributed, i.e., FX (x) = FY (x) = exp {− exp (−x)}, for x ∈ R. We focus for simplicity on the bivariate representation and leave multivariate analogues for subsequent investigations. The conditional approach of Heffernan and Tawn (2004), formalised later by Heffernan and Resnick (2007), concerns the characterisation of the limiting distribution of Y | X > u, as u → ∞. Specifically, the conditional dependence model of Heffernan and Tawn (2004) is motivated by the relatively weak assumption that there exist location and scaling norming functions a : R+ → R and b : R+ → R+ , respectively, such that, for any x > 0 and z ∈ R, lim Pr {X − u > x, {Y − a(X)}/b(X) < z | X > u} = exp(−x)G(z),

u→∞

(1)

where G is a non-degenerate distribution function. To ensure G is well-defined, the condition lim G(z) = 1,

z→∞

is required, so that G places no mass at +∞ but some mass is allowed at −∞.

1

Formulation (1) is due to Heffernan and Resnick (2007) and differs from the Heffernan and Tawn (2004) formulation in the way that the latter is based only on suitably defined regular conditional distributions. However, under conditions on the smoothness of the joint density of (X, Y ) given by Wadsworth et al. (2014), the following limits are equivalent     Y − a(X) Y − a(x) lim Pr < z X > u = G(z) ⇔ lim Pr < z X = x = G(z). (2) u→∞ x→∞ b(X) b(x) The right-hand side statement of expression (2) allows the development of useful statistical models and we use it throughout the paper. For positively dependent random variables, Heffernan and Tawn (2004) found that, for all the standard copula models studied by Joe (1997) and Nelsen (1999), the norming functions a(x) and b(x), fell into the simple parametric class a(x) = αx and b(x) = xβ ,

(3)

where α ∈ [0, 1] and β ∈ (−∞, 1). The case α = 1 and β = 0 corresponds to asymptotic dependence of X and Y , i.e., the coefficient  χ = lim Pr Y > FY−1 (p) | X > FX−1 (p) , (4) p→1

is positive, whereas any other combination of α and β corresponds to asymptotic independence of X and Y , i.e., χ is zero. Model class (3) has been subject of criticism (Smith, 2004) since the functions a and b seem to be ‘proof by example’ rather than a general result. Heffernan and Resnick (2007) note that a and b are regularly varying but no examples are known to date other than the canonical form (3). The statistical model of Heffernan and Tawn (2004) assumes the parametric form (3) for the functions a and b, and is based on the approximation that the limiting relationship (1) holds exactly for all values X > u for a suitably high threshold u. The model has been found to be flexible in large scale applications of multivariate extremes (Latham, 2006; Paulo et al., 2006; Keef et al., 2009; Hilal et al., 2011; Eastoe and Tawn, 2012). Keef et al. (2013) proposed changes to the formulation of Heffernan and Tawn (2004), specifically for (X, Y ) to have standard Laplace margins which, due to the symmetry of the Laplace distribution, parsimoniously extends model class (3) to −1 ≤ α ≤ 1 and accounts for negatively dependent random variables when α ∈ [−1, 0). Spatial extremes is an area of recent activity which lies at the intersection of extreme value theory and geostatistics, and concerns the areal modelling of extremes of spatial processes such as rainfall, river flow and temperature (Davison et al., 2012). Due to the fact that max-stable processes arise as the only non-trivial limits of point-wise maxima of appropriately normalized processes (de Haan, 1984), most models proposed for modelling spatial extremes are maxstable (Smith, 1990; Schlather, 2002; Kabluchko et al., 2009; Davison and Gholamrezaee, 2012). The most widely used max-stable processes are the Brown-Resnick (Brown and Resnick, 1977; Kabluchko et al., 2009), the extremal Gaussian (Schlather, 2002) and the extremal-t (Demarta and Mcneil, 2005; Nikoloulopoulos et al., 2009) max-stable processes. However, the major weakness with max-stable processes for modelling spatial dependence in extreme events is that they permit only asymptotic dependence or exact independence at any separation between sites. In contrast, data are often found to exhibit asymptotic independence. Wadsworth and Tawn (2012) motivated the class of inverted max-stable distributions and showed that any max-stable process can be converted to a viable model for spatial asymptotic independence. We show that for inverted max-stable models parameterized by spectral densities that are regularly varying at their end points, the normalization required is equal to model class (3), up to 2

the inclusion of a slowly varying function. It is shown that model class (3) also contains inverted max-stable models that place mass on the lower end point of their spectral measure, including the inverted Schlather and extremal-t (Demarta and Mcneil, 2005; Nikoloulopoulos et al., 2009). Additionally, we show that the normalization required for a non-degenerate limit law G does not belong to the family of norming functions (3) when (X, Y ) follows the inverted H¨ usler–Reiss copula. This feature, identified by Γ-variation (de Haan, 1970) at the lower end-point of the spectral density, is also illustrated with one new, closely related example. Last, we compare the new models with the current statistical approach of Heffernan and Tawn (2004) based on the canonical family (3) and show, through simulation, that at practical levels, model (3) approximates well the conditional distribution of the inverted H¨ usler–Reiss. The paper is structured as follows. In Section 2 we present the classes of max-stable and inverted max-stable distributions and the conditional extremes results of Heffernan and Tawn (2004) for a family of inverted max-stable distributions. In Sections 3.1 and 3.3, we present the conditional representation of the class of inverted max-stable distributions parameterized over regularly varying and Γ-varying spectral densities at their end-points, respectively. In Section 3.2, we present the conditional representation of the inverted H¨ usler–Reiss distribution. In Section 3.4 an additional example outside the class of inverted max-stable distributions with Γ-varying conditional survivor is illustrated. Our derivations and proofs are included in the Appendix. For the rest of this paper, we refer to a function h : R+ → R+ as regularly varying at 0, with index t ∈ R, short-hand h ∈ Rt (0+ ) if, for all w > 0, lim h(sw)/h(s) = wt .

s→0+

For any h ∈ Rt (0+ ), it follows that for all w > 0, h(w) = wt L(w), where L is a slowly varying function, i.e., L ∈ R0 (0+ ). Also, we refer to a function h as Γ-varying at 0 with auxiliary function f , short-hand h ∈ Γf (0+ ), if for all w > 0, lim h {s + wf (s)} /h(s) = exp(w).

s→0+

2 2.1

Bivariate inverted max-stable distributions Max-stable and inverted max-stable distributions

Max-stable distributions arise naturally as the only non-degenerate limit distributions of appropriately normalized component-wise maxima of random vectors. In standard Gumbel margins, and for x, y ∈ R, a bivariate max-stable distribution function is defined by  Z 1  F (x, y) = exp [−V {exp(x), exp(y)}] = exp − max {w exp(−x), (1 − w) exp(−y)} dH(w) , 0

(5) where V is termed the exponent measure and H is an arbitrary finite measure on [0, 1], known as the spectral measure, with total mass 2, satisfying the marginal moment constraint Z 1 w dH(w) = 1. (6) 0

Let C = {{0}, {1}, (0, 1)} be a partition of the set [0, 1]. Coles and Tawn (1991) showed that if V is differentiable, then H(w) has spectral density h(w) on the interior (0, 1) and can have

3

mass H({k}), k = 0, 1, on each of {0} and {1}, given by h(w) = −

∂2V (w, 1 − w) ∂x∂y

0 < w < 1, (7)

∂V H({0}) = −y 2 lim (x, y), x→0 ∂y

∂V and H({1}) = −x2 lim (x, y). y→0 ∂x

As the class of bivariate max-stable distributions does not admit a finite dimensional parameterisation, a natural method for modelling the spectral measure H of expression (5) relies on constructing parametric sub-classes of models that are flexible enough to approximate any member from the class (Coles and Tawn, 1991; Ballani and Schlather, 2011). Two such sub-models are H¨ usler and Reiss (1989) and Schlather (2002) max-stable distributions which have exponent measures, for x, y > 0,       1 1 λ 1 y λ 1 x V (x, y) = Φ + Φ λ ∈ (0, ∞), (8) + log + log x 2 λ x y 2 λ y  "  1/2 # 1 1 1 xy V (x, y) = 1 + 1 − 2 (1 + ρ) ρ ∈ (−1, 1), (9) + 2 x y (x + y)2 respectively, where Φ is the cumulative distribution function of the standard normal distribution. The parameters λ and ρ control the strength of dependence. In particular, increasing and decreasing values of ρ and λ, respectively, imply stronger dependence between X and Y . Given a max-stable distribution with exponent measure V as in equation (5), the bivariate random variable (X, Y ) follows the inverted max-stable distribution with Gumbel margins if, for x, y ∈ R, its joint survivor function is, Pr (X > x, Y > y) = exp [−V {1/T (x) , 1/T (y)}] ,

(10)

where T (x) = − log [1 − exp {− exp (−x)}] = x + O{exp(−x)}, as x → ∞. The class of inverted max-stable distributions permits asymptotic independence or exact perfect dependence between the random variables X and Y , a feature explained by the coefficient of tail dependence η (Ledford and Tawn, 1997), i.e., . Pr (X > log x, Y > log x) = x−1/η , η = 1/V (1, 1) ∈ [1/2, 1], as x → ∞, with η = 1 only when V (1, 1) = 1, i.e., perfect dependence of X and Y corresponding to asymptotic dependence. For all other values of V (1, 1) > 1 then η < 1 and X and Y are asymptotically independent.

2.2

Conditional representations of inverted max-stable distributions

Assume that (X, Y ) follows an inverted max-stable distribution with Gumbel margins and corresponding exponent measure V . Assuming y grows with x, with y/x → 0, as x → ∞ and differentiable exponent measure V , the conditional survivor function is approximately equal to . Pr (Y > y | X = x) = −Vx {1, x/T (y)} exp [x − xV {1, x/T (y)}] y ∈ R, (11) where Vx (x, y) = ∂V (x, y)/∂x. Heffernan and Tawn (2004) explored the conditional representation (1) for the class of inverted max-stable distributions subject to the assumption that H places all the mass in (0, 1] and that the spectral density satisfies h(w) ∼ L(w − w1 )(w − w1 )t 4

as w → w1 ∈ [0, 1/2),

(12)

for w1 = 0 and L(w) slowly varying at 0 with limw→w+ L(w − w1 ) = s > 0 and t > −1. Under 1 this setting, the normalization (3) required to give a non-degenerate limiting conditional law has α = 0, β = (t + 1)/(t + 2) and the limit is of Weibull type, i.e.,     sz t+2 β lim Pr Y < X z | X > u = 1 − exp − , for z > 0. (13) u→∞ (t + 1)(t + 2)

3 3.1

New representations Regular variation at lower tail of spectral measure

In this section, we extend the conditional representation (11) for the class of inverted max-stable distributions. In particular, we explore model (12) and the effect on the normalizing functions a(x) and b(x) when the spectral measure H places its mass in a sub-region, [w1 , wf ] say, of [0, 1]. Motivated by the Schlather distribution (9), for which the spectral measure places mass at {0}, we also explore the assumption of possible mass on the lower end point w1 of H, a concept made explicit in Lemma 1. Lemma 1. Let w1 , wf , f ∈ N+ , be the lower and upper end points, respectively, of the spectral measure H of an inverted max-stable distribution (10), i.e., w1 = inf {0 ≤ w < 1/2 : H(w) ≥ 0} ,

wf = sup {1/2 < w ≤ 1 : H(w) ≤ 2} ,

and assume that, apart from a countable set D = {2, . . . , `}, ` ∈ N, w` ≤ wf , of points for which H({wi }) > 0 for i ∈ D and H({w1 }) ≥ 0, the spectral measure is absolutely continuous with respect to the Lebesgue measure. Then, if T (y)/{x + T (y)} → w1 , for y = y(x) a function of x and T : R → R defined in expression (10), as x → ∞, Vx {1, x/T (y)} → w1 H({w1 }) − 1.

(14)

In Lemma 1 the case of perfect positive dependence between X and Y , i.e., w1 = wf = 1/2, is excluded since there can be no possible normalization such that G in expression (1) is nondegenerate. Proposition 1 gives the approximate log-conditional survivor function of Y | X = x. Proposition 1. Under the conditions of Lemma 1 and for h(w) as in expression (12), as x → ∞, log {Pr (Y > y | X = x)} is approximately equal to   T (y) log {1 − w1 H({w1 })} − {x + T (y)} − w1 H({w1 }) x + T (y)   t+2 x + T (y) T (y) T (y) − L − w1 − w1 . (15) (t + 1)(t + 2) x + T (y) x + T (y) Approximation (15) consists of three additive terms, the first two are contributions from the mass H({w1 }) on the lower end point w1 , and the last from the functional behaviour of the absolutely continuous component h near w1 . In Corollary 1, the log-conditional survivor is categorised according to the two exclusive cases of positive mass and zero mass on the lower end point w1 , of H. Corollary 1. For T (y)/{x+T (y)} → w1 ∈ [0, 1/2) as x → ∞, we obtain that log Pr (Y > y | X = x) is approximately equal to (i) for H({w1 }) = 0,  t+2   y y − w1 − w1 {(1 − w1 )(t + 1)(t + 2)}, (16) −xL x+y x+y 5

(ii) for H({w1 }) > 0,  log {1 − w1 H({w1 })} − {x + T (y)}

 T (y) − w1 H({w1 }). x + T (y)

(17)

As there is no contribution from the spectral density in expression (17), a general form for the normalization can be obtained directly. General forms of the normalizing functions a(x) and b(x) cannot be obtained from representation (16) so it is helpful to consider an additional assumption for the slowly varying function L. Heffernan and Tawn (2004) results were based on the case of L having a finite right limit at 0, Corollary 2 extends this. Corollary 2. Under the conditions of Proposition 1, we obtain the normalizing functions and limiting conditional distributions for the cases H({w1 }) = 0 and H({w1 }) > 0, respectively.  −1/(t+2) for x > 0 and (i) Let a(x) = {w1 /(1 − w1 )}x and b(x) = x(t+1)/(t+2) L x−1/(t+2) assume that for all τ ∈ (0, 1), L {wL(w)−τ } lim = 1. (18) L(w) w→0+ Then as u → ∞, (1 − w1 )3+2t z t+2 Pr {Y < a(X) + b(X)z | X > u} → 1 − exp − (t + 1)(t + 2) 

 for z > 0.

(ii) Let a(x) = {w1 /(1−w1 )}x and b(x) = 1 for x > 0. Then as u → ∞, Pr {Y < a(X) + b(X)z | X > u} converges to ( 1 − [1 − exp{− exp(−z)}]H({w1 }) , if w1 = 0 1 − {1 − w1 H({w1 })} exp {−(1 − w1 )H({w1 })z} , if w1 > 0, for z ∈ R and z > log {1 − w1 H({w1 })} /{(1 − w1 )H({w1 })}, respectively. Condition (18) is satisfied by a range of slowly varying functions, including those studied by Heffernan and Tawn (2004) as well as by functions that approach ∞ when the argument tends to zero. Examples for L(w), with limw→0+ L(w) = ∞, satisfying condition (18) include logκ (− log w), κ ∈ N0 , exp {(− log w)ν }, ν ∈ (0, 1) and exp {− log w/ log (− log w)}, where logk is the iterated logarithm function defined recursively by logk x = logκ−1 log x, log0 x = x and log1 x = log x. Remark 1. For limw→0+ L(w) = s > 0, all cases of norming functions (i)-(ii) in Corollary 2 reduce, after absorbing s into the limiting law, to the parametric class of Heffernan–Tawn, i.e., a(x) + b(x)z = αx + xβ z, where α = w1 /(1 − w1 ) ∈ [0, 1) for w1 ∈ [0, 1/2) and β = (t + 1)/(t + 2) ∈ [0, 1) for t ≥ −1. This is the first example to be known with both α and β a function of the parameters. Additionally, the same result holds for the canonical regularly varying function L(w − w1 )(w − w1 )t = (w − w1 )t , i.e., when s = 1. Remark 2. When limw→0+ L(w) = ∞ and subject to condition (18), the additional factor L{x−1/(t+2) }−1/(t+2) enters in the scaling function and reduces the rate of increase of b(x) to ∞, as x → ∞. Another interesting case which is satisfied by many max-stable models that appear in the literature, is when w1 = 0 and H({w1 }) > 0 in case (ii) of Corollary 2. In this case, a(x) = 0 and b(x) = 1 for all x > 0 and the random variables X and Y , conditionally on X > u, are nearly independent in the limit u → ∞, with exact independence occurring when the limit 6

distribution is standard Gumbel, i.e., when H({w1 }) = 1. Two such max-stable models come from the extremal-t (Nikoloulopoulos et al., 2009) and Gaussian-Gaussian (Wadsworth and Tawn, 2012) processes, for which the exponent measures of their bivariate distributions are # # " " 1 1 (y/x)1/ν − ρ (x/y)1/ν − ρ + Tν+1 , (19) V (x, y) = Tν+1 x y {(1 − ρ2 )/(ν + 1)}1/2 {(1 − ρ2 )/(ν + 1)}1/2 )1/2   Z ( 1 1 1 1 φ2 (u)2 φ2 (u) φ2 (h − u) φ2 (h − u)2 V (x, y) = + − 2ρ(h) + + du,(20) 2 x y 2 R2 x2 xy y2 respectively, where ν > 0, h ∈ R2+ , ρ(h) ∈ [−1, 1] is a valid correlation function, Tν+1 is the distribution function of the standard-t distribution with ν + 1 degrees of freedom, and φ2 is the density of the standard bivariate normal distribution with correlation ρ(h). The corresponding mass on the lower end point w1 = 0 of models (19) and (20) is "   # ν + 1 1/2 1 − ρ(h) H({0}) = Tν+1 −ρ and H({0}) = , 2 1−ρ 2 respectively. Table 1 gives a collection of other max-stable models, including the Schlather distribution (9), placing positive mass on {0}. Table 1: The mass of the spectral measure on {0} of bivariate exponent measures, from top to bottom, of mixed, asymmetric and asymmetric mixed logistic (Tawn, 1988), Schlather (Schlather, 2002) and Marshall and Olkin (1967) distributions. The final column shows the parameter space, Θ, of the model. H({0}) Θ  V (x,  y) 1 x

θ + 1 − x+y n y oα 1/α 1/α 1−φ 1−θ + + (θ/x) + (φ/y) x y  −2   2φ+θ θ+φ 1 1 1 1 1 + − + + x y xy  x y x y o1/2    n 1 1 1 1 + 1 − 2(1+ρ)xy 2 x + y (x+y)2   α x1 + y1 + (1 − α) max {1/x, 1/y}

3.2

1−θ

θ ∈ (0, 1)

1−φ

0 ≤ θ, φ, α ≤ 1

1−φ−θ

θ, θ + 3φ > 0 and θ + φ, θ + 2φ ≤ 1

(1 − ρ) /2

ρ ∈ (−1, 1)

α

0≤α≤1

Inverted H¨ usler–Reiss distribution

In this section, we focus on the limiting conditional representation of the inverted H¨ usler–Reiss distribution (8). The spectral measure of the H¨ usler–Reiss distribution places no mass on any 0 ≤ w ≤ 1, and the spectral density satisfies n o exp(−λ2 /8) −3/2 2 2 h(w) ∼ w exp − (log w) /(2λ ) as w → 0. (21) λ(2π)1/2 This corresponds to a different form than expression (12) or its more general forms of the slowly varying function L. In particular, the spectral density is Γ-varying at 0 with auxiliary function f (w) = −λ2 w/ log w. As Proposition 2 shows, this example leads to a different form for the normalizing functions a(x) and b(x) than the ones considered by Heffernan and Tawn (2004). 7

Proposition 2. Assume that (X, Y ) follows the inverted max-stable distribution (10) with exponent measure (8). For x > 0, define the norming functions   λ2 λ log log x 1/2 + and b(x) = a(x)/(log x)1/2 . (22) a(x) = x exp −λ(2 log x) + 2 (2 log x)1/2 Then, for any z ∈ R,    n √ o Y − a(X) λ lim Pr exp − 2z/λ . < z | X > u = 1 − exp − u→∞ b(X) (8π)1/2

(23)

−1 FX (p1 )

1 0

0

a(x)/x

log b(x)/ log x

1

Remark 3. Limit distribution (23) is of reverted Gumbel type, which is different from the limits in Corollary 2. The rate of convergence to the limit is order log log u/(log u)1/2 .

−1 FX (p2 )

−1 FX (p1 )

−1 FX (p3 )

x

−1 FX (p2 )

−1 FX (p3 )

x

Figure 1: Plots of a(x)/x and log b(x)/ log x, x > FX−1 (0.87), where a(x) and b(x) are given by expression (22), for different values of λ ranging from 0.01 (bottom curves) to 20 (top curves). The inverse of the standard Gumbel distribution function is FX−1 (p) = − log{− log(p)}, for p ∈ (0, 1). The values of p1 , p2 and p3 are 0.95, 1 − 10−7 and 1 − 10−13 , respectively. A natural question that arises from this counter-example relates to how well can the canonical model (3) approximate the conditional distribution of Y | X > u, for large u, when the random vector (X, Y ) follows the inverted H¨ usler–Reiss distribution with Gumbel margins. To facilitate comparisons between the two models, Figure 1 shows the graphs of a(x)/x and log b(x)/ log x where a(x) and b(x) are given by expression (22), for several values of the dependence parameter λ and a range of x values above the 0.87 standard Gumbel quantile. Both plots show that a(x)/x and log b(x)/ log x are approximately constant for large x so that the canonical class of norming functions is likely to approximate well a(x) and b(x) by αx and xβ , respectively. Subsequently, we simulated data from the inverted H¨ usler–Reiss distribution and fitted the conditional dependence model of Heffernan and Tawn (2004) using: i) the current model (3) and ii) the model implied by the norming functions of the inverted H¨ usler–Reiss distribution, treating the functions (22) as a parametric model for the growth of Y given large X. Our comparisons are based on the differences between the conditional quantile estimates of Y | X = x from the two models. 8

10 0

5

Y

10 5

Y

0

4

6

8

10

12

4

6

8

10

12

8

10

12

10

Y 5 0

0

5

Y

10

15

X

15

X

4

6

8

10

12

4

X

6

X

Figure 2: Conditional exceedances above the 0.935 standard Gumbel quantile from a simulated sample of size 103 from the inverted H¨ usler–Reiss distribution with Gumbel margins for λ = 1.3 (top) and λ = 0.3 (bottom). The black lines correspond to averaged estimates from 100 simulations of the 0.025, 0.5 and 0.975 conditional quantiles of Y | X = x, x > FX−1 (0.935), using the Heffernan–Tawn model (3) (left) and the model constructed by the theoretical functions in expression (22) (right). Grey lines correspond to the theoretical 0.025, 0.5 and 0.975 conditional quantiles. For both models, similar to Heffernan and Tawn (2004), we used for the limiting law G in expression (1) the false working assumption of a normal distribution with mean and variance parameters. We considered two values for the dependence parameter, i.e., λ = 1.3 (weak dependence) and λ = 0.3 (strong dependence). For each λ, 102 samples of size 103 were generated from the inverted H¨ usler-Reiss distribution and the 0.025, 0.5 and 0.975 conditional quantile estimates of Y | X = x, for x > FX−1 (0.935), were computed from the two model fits, i.e., model (3) and the model defined by expression (22). The conditional quantile estimates are of the form a ˆ(x) + ˆb(x)ˆ zp , where a ˆ(x), ˆb(x) are maximum likelihood estimates and zˆp is the p-th empirical quantile of Zˆ = {Y − a ˆ(x)}/ˆb(x), for large x. Figure 2 shows the averaged estimates of the conditional quantiles along with the theoretical conditional quantiles. Both models estimate the true conditional quantiles well and their behaviour is almost indistinguishable. This shows that the canonical model is flexible enough to approximate the conditional distribution of the inverted H¨ usler-Reiss distribution.

3.3

Γ-variation at lower tail of spectral density

Having identified a new form for the tail of the spectral density, we consider in this section the approximate log-conditional survivor function of Y | X = x, under the assumption h(w) ∼ g(w − w1 ),

as w → w1 ,

(24)

where g(w) ∈ Γf (0+ ). Similar to Section 3.1, we consider the assumption of possible mass at the lower end point w1 . Our findings are based on the assumptions of a differentiable spectral 9

density h and Lemma 2. Lemma 2. Let g : R+ → R+ ∈ Γf (0+ ) and U (w) ∈ Rν (0+ ), ν ∈ R. Assume further that there exists an  > 0 such that U and g are C ∞ (0, ) functions with Z g 0 (w) w lim g(s) ds (25) w→0+ g(w)2 0 existing. Then (i) U (w)g(w) ∈ Γf (0+ ). (ii) Define f (w) = g(w)/g 0 (w), w > 0. Then, f is an auxiliary function for g. (iii) For f (w) as in (ii), Z w   (U f )0 (w) {(U f )0 f }0 (w) U (s)g(s)ds {U (w)f (w)g(w)} = 1 − + − ··· U (w) U (w) 0 = 1 + o(1),

as w → 0+ .

(26) (27)

Proposition 3 gives the approximate log-conditional survivor function. Proposition 3. For h(w) as in expression (24) and under the conditions of Lemma 1 and Lemma 2, log {Pr (Y > y | X = x)} is approximately equal to   T (y) log {1 − w1 H({w1 })} − {x + T (y)} − w1 H({w1 }) x + T (y)     T (y) T (y) 2 − {x + T (y)} f − w1 h . (28) x + T (y) x + T (y) As an example, we explore a new class of spectral densities that are more flexible than the spectral density (21) of the H¨ usler–Reiss distribution. Specifically, consider for γ > 0, δ ∈ R and κ > 0, the Γf (0+ )-varying function  as w → 0+ , (29) h(w) ∼ wδ exp −κw−γ with auxiliary function

f (w) = (κγ)−1 w1+γ .

(30)

Proposition 4 gives the normalizing functions and limiting conditional distribution for this example. Proposition 4. Assume that (X, Y ) follows the inverted max-stable distribution (10) with spectral measure H with w1 = 0 and H({0}) = 0, and spectral density h that satisfies expression (29). For x > 0, define the norming functions " # −1 log x log κ a(x) = xκ1/γ (log x)−1/γ 1 + γ −2 {δ + 2(1 + γ)} and b(x) = x (log x)−1−1/γ . log x (31) Then, for any z ∈ R, n  o lim Pr {Y < a(X) + b(X)z | X > u} = 1 − exp − (κγ)−2 exp γκ−1/γ z .

u→∞

(32)

Remark 4. Similarly with the inverted H¨ usler–Reiss distribution, the limiting conditional distribution is of reverted Gumbel type and the norming functions (31) do not belong to the HeffernanTawn parametric family. The rate of convergence to the limit is order log log u/ log u.

10

2 0

1

a(x)/x

2 1

a(x)/x

0 −1 FX (p1 )

−1 FX (p2 )

−1 FX (p1 )

−1 FX (p3 )

−1 FX (p3 )

x

0

−20

−15

−10

log b(x)/ log x

1

a(x)/x

−5

0

2

x

−1 FX (p2 )

−1 FX (p1 )

−1 FX (p2 )

−1 FX (p1 )

−1 FX (p3 )

x

−1 FX (p2 )

−1 FX (p3 )

x

Figure 3: Plots of a(x)/x for different values of the parameters γ (top left), κ (top right) and δ (bottom left) and log b(x)/ log x (bottom right), x > FX−1 (0.87). The inverse of the standard Gumbel distribution function is FX−1 (p) = − log{− log(p)}, for p ∈ (0, 1). The values of p1 , p2 and p3 are 0.95, 1 − 10−7 and 1 − 10−13 , respectively. Figure 3 shows the graphs of functions a(x)/x and log b(x)/ log x, for several values of the parameters γ, δ, κ, and large x. Large values of γ correspond to strong dependence between X and Y so that a(x)/x is nearly constant and equal to 1 for large x. Small values of γ correspond to independence with sharp decrease of a(x)/x to 0 as x increases. Intermediate values of γ correspond to mild-moderate dependence of X and Y with a(x)/x having a turning point and decaying as x increases. Parameters κ and δ, seem to have similar effect with larger values corresponding to increasing dependence. Last, log b(x)/ log x is approximately constant with large x. Comparing with the canonical model (3), the degree of approximation of a(x) and b(x) 11

by αx and xβ is not especially good for a(x). To our knowledge, there is no current parametric model for the max-stable process with spectral density satisfying expression (29).

3.4

Γ-variation outside the class of inverted max-stable distributions

Let θ, θx , θy , θxy > 0 and consider the joint density and conditional survivor for the negatively dependent pair of random variables (X, Y ), given by, for x, y > 0 fX,Y (x, y) = exp {θ − θx x − θy y − θxy xy} Pr (Y > y | X = x) = exp {− (θy + θxy x) y} ,

(33) (34)

respectively. It is readily verified by the von Mises condition (Leadbetter, 1983), that the distribution functions of X and Y are in the domain of attraction of the Gumbel distribution, so that the upper tail is exponential. However, conditionally on X > u and for sufficiently large u, all the probability mass of Y is concentrated on the lower end-point of its distribution, i.e., 0. To ensure an exponential lower tail for Y (Heffernan and Resnick, 2007), we transform the lower tail of Y to the lower tail of the standard Laplace distribution, i.e., exp(y)/2, for y < 0. Proposition 5 gives the norming functions and limiting conditional law for model (33) and illustrates, although for negative dependence, the same feature encountered in the models studied in Sections 3.2 and 3.3. Proposition 5. Assume that (X, Y ) has joint density (33) and let g(u) = log(2u), for u < 1/2. For x > 0, define the norming functions a(x) = − log x and b(x) = 1. Then lim Pr [g {FY (Y )} < a(X) + b(X)z | X > u] = 1 − exp {−θx θxy exp (z − θ) /2} ,

u→∞

for z ∈ R. The limiting conditional law is of reverted Gumbel type and the scaling function a(x) does not fall in model class (3). Finally, note the canonical function, −αx, α ∈ [−1, 0), for negative dependence with Laplace margins (Keef et al., 2013), does not approximate well a(x).

Acknowledgments Ioannis Papastathopoulos acknowledges funding from the SuSTaIn program - Engineering and Physical Sciences Research Council grant EP/D063485/1 - at the Department of Mathematics of the University of Bristol.

12

4

Appendix

4.1

Proof of Lemma 1

For any point w1 ≤ w∗ < wf , there exists an i ∈ E = {1} ∪ D so that E can be decomposed into E = E≤w∗ ∪ E>w∗ , where E≤w∗ = {1, . . . , i}, E>w∗ = {i + 1, . . . , f }, and wi ≤ w∗ < wi+1 . For s/(s + t) ∈ [w1 , wf ] \ E, the partial derivative of the exponent measure is equal to "Z # Z s wf s+t ∂V (s, t) ∂ = (w/s)dH(w) + {(1 − w)/t}dH(w) s ∂s ∂s w 1 s+t   Z Z ∂  (w/s)h(w)dw + {(1 − w)/t}h(w)dw = s ∂s [ s ,wf ]\E> s s w , \E [ 1 ] ≤ s+t

X



j∈E>

[

s+t

(wj /s2 )H({wj })

s s+t

Z =−

s+t

s+t

s ,wf s+t

(w/s2 )h(w)dw −

s ]\E> s+t

X j∈E>

(wj /s2 )H({wj }),

s s+t

which, under the assumption of T (y)/{x + T (y)} → w1 , as x → ∞, yields Z X wh(w)dw − wj H({wj }), Vx {1, x/T (y)} → − [w1 ,wf ]\E>w1 j∈E>w1

(35)

as x → ∞. Using the moment constraint (6) we have that Z Z X wh(w)dw = wh(w)dw = 1 − wj H({wj }), [w1 ,wf ]\E [w1 ,wf ]\E>w1 j∈E which yields equation (14), after combining with equation (35).

4.2

Proof of Proposition 1

For the sake of simplicity we derive the result for the case D = ∅ but note that the proof for non-empty D follows in a similar way. Working similarly to the proof of Lemma 1, we get, after combining equations (11) and (14), that for c(x, y) = T (y)/{x + T (y)}, d(x, y) = {x + T (y)}/{(1 − w1 )T (y) − w1 x}, the log-conditional survivor, log {Pr (Y > y | X = x)}, is approximately equal to log {1 − w1 H({w1 })} + {w1 x − (1 − w1 )T (y)} H({w1 }) Z

c(x,y) t

+ {x + T (y)}

Z

c(x,y)

wL(w − w1 )(w − w1 ) dw − T (y) w1

L(w − w1 )(w − w1 )t dw

w1

= log {1 − w1 H({w1 })} + {w1 x − (1 − w1 )T (y)} H({w1 }) Z



+ {x + T (y)}

L(1/s)s

−(t+3)

Z



ds + {w1 x − (1 − w1 )T (y)}

d(x,y)

d(x,y)

13

L(1/s)s−(t+2) ds.

For t > −1, d(x, y) → ∞, c(x, y) → w1 , as x → ∞, we have, from Karamata’s theorem (Resnick, 1987, pg. 17), that the last expression is approximately equal to log {1 − w1 H({w1 })} + {w1 x − (1 − w1 )T (y)} H({w1 }) +

{x + T (y)} {w1 x − (1 − w1 )T (y)} {d(x, y)}−(t+2) L{1/d(x, y)} + {d(x, y)}−(t+1) L{1/d(x, y)}, (t + 2) (t + 1)

as x → ∞, which simplifies to expresion (15).

4.3

Proof of Proposition 2

Let φ be the probability density function of the standard normal distribution. Assuming y → ∞ as x → ∞ with y/x → 0, we have from expression (11), Lemma 1 and Mill’s ratio, that for large x, the log-conditional survivor, log Pr (Y > y | X = x), is approximately equal to "  #  φ λ2 + λ1 log(x/y) φ λ2 − λ1 log(x/y) x−x 1− λ 1 +y λ 1 2 + λ log(x/y) 2 − λ log(x/y) " λ 1  # λ 1 φ 2 + λ log(x/y) y φ 2 − λ log(x/y) λ2 + λ1 log(x/y)   1+ =x λ 1 x φ λ2 + λ1 log(x/y) λ2 − λ1 log(x/y) 2 + λ log(x/y)  " λ 1 # φ λ2 + λ1 log(x/y) + log(x/y) =x λ 1 1 +  λ2 λ1 + log(x/y) 2 λ 2 − λ log(x/y) h 1 n oi . −1 1/2 φ λ log(y/x) 1 + O (log x) , (36) = −c (xy) 1 2 log(y/x) λ  where c = λ exp −λ2 /8 . Now, let z ∈ R and y = a(x) + b(x)z, where a(x) and b(x) are given by equations (22). We have, as x → ∞, ( )#  " λ log log x 2 1/2 1/2 (xy) = x exp λ /4 − √ (log x) 1+O , (37) 2 (log x)1/2  2 h n oi 1 log(x/y) = 2 log x 1 + O (log x)−1/2 , (38) λ " √ #   1 λ2 λ 2z −1/2 1/2 log(x/y) = (2π) exp − log x − + √ (log x) + log log x + φ λ 8 λ 2 (39)  × 1+O



(log log x)2 log x

 .

Combining equations (36), (37), (38) and (39) we get ( ) n √ o λ log log x Pr {Y < a(x) + b(x)z | X = x} = 1 − exp − exp − 2z/λ + O . (8π)1/2 (log x)1/2 

Last, direct application of statement (2) yields the result of Proposition 2.

14

4.4

Proof of Lemma 2

(i) First, for any auxiliary function f , we have that limt→0+ f (t)/t = 0 (see de Haan, 1970, Lemma 1.5.1). Next, for U (w) = wν L(w), where L ∈ R0 (0+ ) and ν ∈ R, lim

t→0+

U {t + wf (t)} g({t + wf (t)}) L [t {1 + wf (t)/t}] g {t + wf (t)} = lim {1 + wf (t)/t} , + U (t)g(t) L(t) g(t) t→0 → exp(w),

w > 0.

(ii) Theorem 1.5.2 in de Haan (1970) asserts that any function f0 that satisfies Z w  f0 (w) ∼ g(s)ds /g(w) as w → 0+ ,

(40)

0

is an auxiliary function for g. Additionally, Theorem 1.5.4 in de Haan (1970) states that if g ∈ Γf (0+ ), then, as w → 0+ Z w Z w 1 2 {g(s)} ds ∼ g(w) g(s) ds. (41) 2 0 0 Given that limit (25) exists it follows by l’Hˆopital’s rule that Rw Rw 1 1 g(s) ds g(w)2 + 12 g 0 (w) 0 g(s) ds 2 g(w) 2 0 Rw = lim , lim 2 g(w)2 w→0+ w→0+ 0 {g(s)} ds

(42)

since the right hand side limit in (42) exists due to limit (25) existing. However, by property (41), we have the left hand side limit in (42) is equal to 1 and so it follows from equality (42) that as w → 0+ , Z w

g(s) ds ∼ {g(w)}2 /g 0 (w).

(43)

0

Combining expressions (40) and (43) we obtain f0 (w) ∼ f (w), as w → 0+ , where f (w) = g(w)/g 0 (w). Hence, for w > 0, the function f (w), is, up to asymptotic equivalence, equal to f0 . (iii) Define f (w) = g(w)/g 0 (w), w > 0. We have Z w Z w U (s)g(s) ds = U (s)f (s)g 0 (s) ds 0

0

Z = U (w)f (w)g(w) −

w

(U f )0 (s)f (s)g 0 (s) ds

0 0

0

Z

= U (w)f (w)g(w) − (U f ) (w)f (w)g (w) +

w



(U f )0 f

0

(s)f (s)g 0 (s) ds,

0

which gives expression (26), after continuation of integration by parts and division by U (w)f (w)g(w). Last, expression (27) follows from de Haan’s theorem (40) and case (i) of Lemma 2.

4.5

Proof of Proposition 3

Similarly with the Proof of Proposition 1, we consider the case D = ∅. Define c(x, y) = T (y)/ {x + T (y)} and l(x, y) = c(x, y) − w1 . For c(x, y) → w1 , l(x, y) = c(x, y) − w1 → 0, as x → ∞, and with T as in expression (10), we have that log Pr (Y > y | X = x) − log {1 − w1 H({w1 })} − {w1 x − (1 − w1 )T (y)} H({w1 }) 15

is equal to Z

l(x,y)

Z

l(x,y)

g(s) ds.

sg(s) ds + {w1 x − (1 − w1 )y}

{x + T (y)}

(44)

0

0

Using the asymptotic expansion (26), up to first order, we have that the two integrals in expression (44) are, as x → ∞, approximately equal to   Z l(x,y) f {l(x, y)} − l(x, y)f 0 {l(x, y)} . sg(s) ds = l(x, y)f {l(x, y)} g {l(x, y)} 1 − , (45) l(x, y) 0 Z l(x,y)   . g(s) ds = f {l(x, y)} g {l(x, y)} 1 − f 0 {l(x, y)} , (46) 0

for U (s) = s ∈ R1 (0+ ) and U (s) = 1 ∈ R0 (0+ ), respectively. Combining expressions (44), (45) and (46) we get that (44) is approximately equal to    f {l(x, y)} − l(x, y)f 0 {l(x, y)} 0 f {l(x, y)} g {l(x, y)} − {w1 x − (1 − w1 )T (y)} f {l(x, y)} − l(x, y)   {x + T (y)}f {l(x, y)} = f {l(x, y)} g {l(x, y)} −{w1 x − (1 − w1 )T (y)} w1 x − (1 − w1 )T (y)     T (y) T (y) 2 = − {x + T (y)} f − w1 h , x + T (y) x + T (y) which completes the proof.

4.6

Proof of Proposition 4

Assuming y → ∞ as x → ∞ with y/x → 0, we have from expressions (11), (28), (29), (30) and Lemma 1, that for large x, the log-conditional survivor, log Pr (Y > y | X = x), is approximately equal to  x δ+2(1+γ) −γ − (y/x) exp −κ (y/x) . (47) (κγ)2 Now, let z ∈ R and y = a(x) + b(x)z, where a(x) and b(x) are given by equations (31). We have, as x → ∞,      log x −{δ+2(1+γ)}/γ log log x δ+2(1+γ) 1+O , (48) (y/x) = κ log x " (  {δ+2(1+γ)}/γ 2 )# n o  log x log log x exp −κ (y/x)−γ = x−1 exp γκ−1/γ z 1 + O .(49) κ log x Combining equations (47), (48) and (49) we get   n  o log log x −2 −1/γ Pr {Y < a(x) + b(x)z | X = x} = 1 − exp − (κγ) exp γκ z +O . log x Last, direct application of statement (2) yields the result of Proposition 4.

16

4.7

Proof of Proposition 5

Consider the pair of random variables (X, Y ) with joint density (33). The marginal density function of Y is, for y > 0, fY (y) = =

1 exp {θ − θy y} θx + θxy y exp(θ) exp(θ) − (θx θy + θxy )y + O(y 2 ), θx θx2

as y → 0.

(50)

Using the Taylor power series representation (50) of the marginal density, we have that as y → 0, the marginal distribution function of Y is equal to FY (y) =

exp(θ) y + O(y 2 ). θx

Now, let z ∈ R and y = a(x)+b(x)z, where a(x) and b(x) as defined in Proposition 5. From (34), we have that, for g(u) = log(2u), u < 1/2, FY← (u) = θx u/ exp(θ) and as x → ∞, Pr [g {FY (Y )} < a(x) + b(x)z | X = x] = Pr {Y < FY← [exp{a(x) + b(x)z}/2] | X = x} = 1 − exp [−θx (θy + θxy x) exp{a(x) + b(x)z − θ}/2] → 1 − exp {−θx θxy exp (z − θ) /2} . Last, direct application of statement (2) yields the result of Proposition 5.

References Ballani, F. and Schlather, M. (2011), ‘A construction principle for multivariate extreme value distributions’, Biometrika 98, 633–645. 4 Brown, B. M. and Resnick, S. I. (1977), ‘Extreme values of independent stochastic processes’, J. Appl. Probability 14, 732–739. 2 Coles, S. G. and Tawn, J. A. (1991), ‘Modelling extreme multivariate events’, J. Roy. Statist. Soc., B 53, 377–392. 3, 4 Davison, A. C. and Gholamrezaee, M. M. (2012), ‘Geostatistics of extremes’, Proc. R. Soc. Lond. Ser. A 468, 581–608. 2 Davison, A. C., Padoan, S. and Ribatet, M. (2012), ‘Statistical modelling of spatial extremes (with discussion)’, Statist. Science 27, 161–186. 2 de Haan, L. (1970), On Regular Variation and its Application to the Weak Convergence of Sample Extremes, Vol. 32 of Mathematical Centre Tracts, Mathematisch Centrum, Amsterdam. 3, 15 de Haan, L. (1984), ‘A spectral representation for max-stable processes’, Ann. Probab. 12, 1194– 1204. 2 Demarta, S. and Mcneil, A. J. (2005), ‘The t copula and related copulas’, International Statistical Review 73, 111–129. 2, 3 17

Eastoe, E. F. and Tawn, J. A. (2012), ‘The distribution for the cluster maxima of exceedances of sub-asymptotic thresholds’, Biometrika 99, 43–55. 2 Heffernan, J. E. and Resnick, S. I. (2007), ‘Limit laws for random vectors with an extreme component’, Ann. Appl. Prob. 17, 537–571. 1, 2, 12 Heffernan, J. E. and Tawn, J. A. (2004), ‘A conditional approach for multivariate extreme values (with discussion)’, J. Roy. Statist. Soc., B 66, 1–34. 1, 2, 3, 4, 6, 7, 8, 9 Hilal, S., Poon, S.-H. and Tawn, J. A. (2011), ‘Hedging the black swan: conditional heteroskedasticity and tail dependence in S&P500 and VIX.’, Journal of Banking and Finance 35, 2374– 2387. 2 H¨ usler, J. and Reiss, R.-D. (1989), ‘Maxima of normal random vectors: between independence and complete dependence’, Statist. Probab. Letters 7, 283–286. 4 Joe, H. (1997), Multivariate Models and Dependence Concepts, Chapman & Hall, London. 2 Kabluchko, Z., Schlather, M. and de Haan, L. (2009), ‘Stationary max-stable fields associated to negative definite functions’, Ann. Probab. 37, 2042–2065. 2 Keef, C., Papastathopoulos, I. and Tawn, J. A. (2013), ‘Estimation of the conditional distribution of a multivariate variable given that one of its components is large: additional constraints for the Heffernan and Tawn model’, J. Mult. Anal 115, 396–404. 2, 12 Keef, C., Svensson, C. and Tawn, J. A. (2009), ‘Spatial dependence in extreme river flows and precipitation for Great Britain’, J. Hydrology 378, 240–252. 2 Latham, M. (2006), Statistical Methodology for the Extreme Values of Dependent Processes, PhD thesis, Lancaster University. 2 Leadbetter, M. R. (1983), ‘Extremes and local dependence in stationary sequences’, Z. Wahrsch. verw. Gebiete 65, 291 – 306. 12 Ledford, A. W. and Tawn, J. A. (1997), ‘Modelling dependence within joint tail regions’, J. Roy. Statist. Soc., B 59, 475–499. 4 Marshall, A. W. and Olkin, I. (1967), ‘A multivariate exponential distribution’, J. Amer. Statist. Assoc. 62, 30–44. 7 Nelsen, R. B. (1999), An Introduction to Copulas, Springer-Verlag, New York. 2 Nikoloulopoulos, A. K., Joe, H. and Li, H. (2009), ‘Extreme value properties of multivariate t copulas’, Extremes 12, 129–148. 2, 3, 7 Paulo, M., van der Voet, H., Wood, J., Marion, G. and van Klaveren, J. (2006), ‘Analysis of multivariate extreme intakes of food chemicals’, Food and Chemical Toxicology 44(7), 994– 1005. 2 Resnick, S. I. (1987), Extreme Values, Regular Variation and Point Processes, Springer–Verlag, New York. 14 Schlather, M. (2002), ‘Models for stationary max-stable random fields’, Extremes 5, 33–44. 2, 4, 7

18

Smith, R. L. (1990), Max-stable processes and spatial extremes, Technical report, University of North Carolina. 2 Smith, R. L. (2004), ‘Discussion of paper by J. Heffernan, and J.A. Tawn’, J. Roy. Statist. Soc., B 66, 1–34. 2 Tawn, J. A. (1988), ‘Bivariate extreme value theory: models and estimation’, Biometrika 75, 397–415. 7 Wadsworth, J. L. and Tawn, J. A. (2012), ‘Dependence modelling for spatial extremes’, Biometrika 99, 1–20. 2, 7 Wadsworth, J. L., Tawn, J. A., Davison, A. C. and Elton, D. M. (2014), ‘Modelling across extremal dependence classes’. Submitted. 2

19

Suggest Documents