Examining applicability of a new technique for threshold selection in extreme value modelling

Samantha Hinsley

Jenny Wadsworth

Maths & Stats, Lancaster University

Why look at a new method of threshold selection?

Background information

Investigating extreme values involves looking at the tails of probability distributions. A difficulty lies in deciding where the tail should be modelled from. This project looked at a new method for defining such a threshold, using data sets of ‘ocean energy’ values over time in places such as the Gulf of Mexico.

The Generalized Pareto distribution

One way to look at extreme values is to choose a threshold and look only at data above this point. After choosing a threshold, the Generalized Pareto (GP) distribution can be used to model the amounts by which the data exceed it. The GP distribution approximates the tail of most probability distributions above a sufficiently high threshold, meaning that most data sets collected can be evaluated in this way.
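As a minimal sketch of what modelling data above a threshold means, the block below evaluates the standard GP survival function for an excess y over the threshold, (1 + shape * y / scale) ** (-1 / shape), with the exponential form exp(-y / scale) in the limiting case shape = 0. The function names and the `rate_above` argument (the proportion of data above the threshold) are illustrative assumptions, not names taken from the poster.

```python
import math


def gpd_survival(y, scale, shape):
    """P(excess > y) for a GP distribution with the given scale and shape."""
    if y < 0 or scale <= 0:
        raise ValueError("need y >= 0 and scale > 0")
    if abs(shape) < 1e-12:
        # Limiting case shape -> 0: exponential tail.
        return math.exp(-y / scale)
    base = 1.0 + shape * y / scale
    if base <= 0.0:
        # Only possible for negative shape: y lies beyond the upper end point.
        return 0.0
    return base ** (-1.0 / shape)


def exceedance_probability(x, threshold, rate_above, scale, shape):
    """P(X > x) for x above the threshold, given P(X > threshold) = rate_above."""
    return rate_above * gpd_survival(x - threshold, scale, shape)
```

For example, `exceedance_probability(5.0, 3.0, 0.1, 1.0, 0.5)` multiplies the chance of exceeding a threshold of 3 by the GP probability that the excess is larger than 2.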

Threshold selection

Figure 1: Plots to aid threshold selection. (a) Parameter estimates against threshold; (b) Mean residual life plot (against u).

After choosing our threshold, we create diagnostic plots for the fitted model to check the threshold’s suitability (Figure 2). In these plots we look for linearity, among other criteria.

Figure 3: Triangle plot (axis label: v).

Figures 1, 2 (old method) and 3 (new) help in choosing where to begin looking at thresholds.

Figure 2: Post threshold calculation diagnostic plots (panels: Probability Plot, Quantile Plot, Return Level Plot, Density Plot).

STOR-i internship 2010

The Gulf of Mexico results are shown in Table 1.

Threshold (v)    p-value
3                0.015
3.1              0.135
3.1              0.12
3.2              0.43

Table 1: Thresholds and p-values.

Table 1 suggests that 3.1 is a suitable threshold: it is the lowest threshold at which the p-value is at least 0.05, in both sets of simulations. This is lower than the threshold that would have been chosen with the old method. The results give no evidence that a threshold above 3.1 is needed.

Analysis of the new method

Advantages:
More accurate
Easier to interpret
Less room for human error

Limitations:
Other variables are not taken into account (e.g. wave direction).
If H0 is always rejected, it could be that no single model fits the tail, but this cannot be guaranteed.
Seasonality in the data affects the results: the p-values can be misleading if the data follow a different trend at different times of year. (Overcome by looking only at winter data.)
The triangle plot can be misleading if there is only a small number of data points: the test statistics can become small if there is only a small amount of data above the threshold.

Conclusion












Finding the best threshold


Likelihood Ratio tests and p-values

LR tests are a type of hypothesis test. They use the likelihood function to test a null hypothesis, H0. With an LR test we find a p-value: the probability of obtaining a test statistic at least as large as the one observed if H0 were true. If the evidence against H0 is strong then the p-value will be small.

New method - Hypotheses

Hypotheses to test with the LR test:
H0: A single tail model is adequate for the chosen threshold.
H1: A single tail model is not adequate for the chosen threshold.
We test the hypotheses by comparing a single tail model with a tail model that has a change point, u (and so is not a single tail).

Triangle plot

Figure 3 shows the test statistics for models tested with different lowest thresholds (v) and different change points (u). The larger the test statistic, the larger the circle on the plot. Large test statistics indicate where the lowest threshold may be too low.
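The comparison of a single tail model with a change-point model, repeated over pairs (v, u), can be sketched as below. This is a deliberately simplified stand-in and not the method used in the project: it assumes the shape parameter is fixed at 0, so that both tail models are exponential and their maximum likelihood fits have closed forms. The function names `lr_statistic` and `triangle_grid` are hypothetical, and `thresholds` is assumed to be sorted in increasing order.

```python
import math


def lr_statistic(data, v, u):
    """LR statistic: one exponential tail above v versus a rate change at u.

    Simplifying assumption (not from the poster): shape fixed at 0, so both
    tail models are exponential and are fitted in closed form.
    """
    if u <= v:
        raise ValueError("change point u must lie above lowest threshold v")
    above_v = [x for x in data if x > v]
    below_u = [x for x in above_v if x <= u]
    above_u = [x for x in above_v if x > u]
    d1, d2 = len(below_u), len(above_u)
    if d1 == 0 or d2 == 0:
        return None  # no data on one side of the change point
    # Total excess accumulated in each segment; points above u also
    # pass through the whole of (v, u].
    t1 = sum(x - v for x in below_u) + d2 * (u - v)
    t2 = sum(x - u for x in above_u)
    n, t = d1 + d2, t1 + t2
    loglik_null = n * math.log(n / t) - n
    loglik_alt = d1 * math.log(d1 / t1) - d1 + d2 * math.log(d2 / t2) - d2
    return 2.0 * (loglik_alt - loglik_null)


def triangle_grid(data, thresholds):
    """Statistic for every pair (v, u) with v < u: the points of a triangle plot."""
    grid = {}
    for i, v in enumerate(thresholds):
        for u in thresholds[i + 1:]:
            stat = lr_statistic(data, v, u)
            if stat is not None:
                grid[(v, u)] = stat
    return grid
```

Because only pairs with v below u are kept, the filled cells form a triangle, and each value could be drawn as a circle whose size grows with the statistic.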

To choose the threshold above which the GPD model is fitted, two types of graph are first plotted. In Figure 1(a), we look for thresholds above which a horizontal line can be drawn that cuts through all the confidence bars. In Figure 1(b), we look for approximate linearity that stays within the confidence bounds. In both plots, we want to choose the lowest possible threshold that meets the criteria.
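The points of a mean residual life plot can be computed directly: for each candidate threshold u, take the mean of the excesses above u. Under a GP model this mean is a linear function of u, which is why approximate linearity is the criterion. The sketch below is illustrative only; the function names are assumptions, and the interval uses a normal approximation to the sample mean.

```python
import math


def mean_excess(data, u):
    """Mean of (x - u) over observations above u, with an approximate 95% interval."""
    excesses = [x - u for x in data if x > u]
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two observations above the threshold")
    mean = sum(excesses) / n
    variance = sum((e - mean) ** 2 for e in excesses) / (n - 1)
    half_width = 1.96 * math.sqrt(variance / n)
    return mean, mean - half_width, mean + half_width


def mean_residual_life(data, thresholds):
    """Rows (u, mean excess, lower, upper) for a mean residual life plot."""
    return [(u,) + mean_excess(data, u) for u in thresholds]
```

The intervals widen as u increases because fewer observations remain above the threshold, which is one reason the lowest acceptable threshold is preferred.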

Using only graphs leaves room for error and misinterpretation. It is difficult to decide on the ‘best’ threshold with this method.

Results

1. Run 200 simulations of the data under H0 from the chosen threshold to find the distribution of the LR test statistic.
2. Use this distribution to find the p-value.
3. If p < 0.05, increase the threshold and repeat.
4. Where the threshold is thought to be, run another set of 200 simulations to ensure accuracy.

When p < 0.05, there is significant evidence to reject H0. When p ≥ 0.05, there is not enough evidence to reject H0, so we proceed with the single tail model.
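The procedure above can be sketched in two small pieces: turning simulated test statistics into a p-value, and stepping up through candidate thresholds until H0 is no longer rejected. This is a minimal illustration, not the project's code. The names `empirical_p_value`, `lowest_acceptable_threshold` and the callback `p_value_for` are hypothetical; in practice `simulated` would hold the statistics from the 200 simulations under H0.

```python
def empirical_p_value(observed, simulated):
    """Share of simulated statistics at least as large as the observed one."""
    if not simulated:
        raise ValueError("need at least one simulated statistic")
    return sum(1 for s in simulated if s >= observed) / len(simulated)


def lowest_acceptable_threshold(thresholds, p_value_for, alpha=0.05):
    """Step up through candidate thresholds until H0 is no longer rejected."""
    for v in sorted(thresholds):
        if p_value_for(v) >= alpha:
            return v
    return None  # H0 rejected at every candidate threshold


# p-values from Table 1, using the first of the two runs at 3.1.
table_1 = {3.0: 0.015, 3.1: 0.135, 3.2: 0.43}
chosen = lowest_acceptable_threshold(table_1, table_1.get)
```

With the Table 1 values, the search rejects H0 at 3.0 and stops at 3.1, matching the threshold reported in the results.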

The old method is useful for seeing the region in which the threshold should lie. The new technique is better for seeing where the threshold actually lies. Overall, the new method of threshold selection removes many of the problems found in the old method. Although the new technique has limitations, many of these would also affect the old method, where they might even go undetected. Best solution: use the old method to indicate where the threshold might lie, then use the new method to quantify the credibility of a chosen threshold. By doing this, it can be seen whether the result from the new technique is likely to be correct or whether it has been affected by an outside factor such as seasonality.

