A new technique for threshold selection in extreme value modelling

Examining applicability of a new technique for threshold selection in extreme value modelling

Samantha Hinsley

Jenny Wadsworth

Lancaster University

September 1, 2010

Extreme value modelling introduction

An example of why extreme values are important

- Oil rigs in the sea are built to withstand a certain amount of energy.
- Not strong enough: dangerous, the rig will be damaged.
- Stronger than necessary: not cost effective.
- Use the extreme data values of wave energy to try to work out a suitable level of energy which the rig must be able to withstand.

How extreme values are modelled

- Investigating extreme values involves looking at the tails of probability distributions.

PROBLEM:

- What makes a value extreme?
- Where should we start modelling the tail?

- This project looked at a new method for finding the answer to these questions.
- The data sets I used were measurements of ‘ocean energy’.

The use of thresholds

- One way to look at extreme values is to choose a threshold and look only at data above this point.
- This method is used most since other methods can ignore important data.

[Plot: data values against index]
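The idea of keeping only the data above a threshold can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the project: the function name `exceedances` and the sample values are invented here.

```python
def exceedances(data, threshold):
    """Return the excesses (x - threshold) for every value strictly above threshold."""
    return [x - threshold for x in data if x > threshold]


# Hypothetical measurements, invented for illustration only.
sample = [2.6, 3.1, 3.4, 3.9, 4.2, 5.7, 9.9]
excess = exceedances(sample, 3.5)
print(len(excess), "values exceed the threshold")
```

Only the excesses are passed on to the tail model; everything at or below the threshold is discarded.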

The use of the Generalized Pareto distribution

- After choosing a threshold, the Generalized Pareto distribution can be used.
- The Generalized Pareto distribution fits the tails of most probability distributions (e.g. Normal, Gamma) once a suitable threshold is chosen.
- This means that most data sets collected can be evaluated in this way.
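For reference, the distribution function of the Generalized Pareto distribution (GPD) can be written down directly. The sketch below uses the standard textbook parameterisation with a scale and a shape parameter, which the slides do not spell out; the function name `gpd_cdf` and its argument names are assumptions made for this example.

```python
import math


def gpd_cdf(y, scale, shape):
    """Distribution function of the GPD evaluated at an excess y over the threshold."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    if y <= 0:
        return 0.0
    if abs(shape) < 1e-12:
        # Limit as shape -> 0: the exponential distribution.
        return 1.0 - math.exp(-y / scale)
    t = 1.0 + shape * y / scale
    if t <= 0:
        # Only possible when shape < 0: y lies beyond the upper end point -scale/shape.
        return 1.0
    return 1.0 - t ** (-1.0 / shape)


# Probability that an excess is at most 1 when scale = 1 and shape = 0.2.
print(gpd_cdf(1.0, 1.0, 0.2))
```

A positive shape gives a heavy tail, a shape of zero gives an exponential tail, and a negative shape gives a tail with a finite upper end point.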

Current method of threshold selection

Choosing a suitable threshold

Initial threshold selection: parameter estimates

Figure: Parameter estimates of the GPD model fitted over a range of thresholds (two panels: modified scale against threshold, and shape against threshold)

In these two plots, we look to construct a horizontal line, from the lowest threshold possible, that cuts through all the confidence bars.
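The reason for looking for a horizontal line can be made explicit with the standard threshold-stability property of the GPD. This is stated here from the general extreme value literature rather than from the slides, and the symbols $u_0$, $\sigma_u$, $\xi$ and $\sigma^{*}$ are notation assumed for this sketch.

```latex
\[
X - u_0 \mid X > u_0 \sim \mathrm{GPD}(\sigma_{u_0}, \xi)
\;\Longrightarrow\;
X - u \mid X > u \sim \mathrm{GPD}\bigl(\sigma_{u_0} + \xi (u - u_0),\ \xi\bigr)
\quad \text{for all } u > u_0 ,
\]
\[
\sigma^{*} = \sigma_u - \xi u = \sigma_{u_0} - \xi u_0 .
\]
```

So if the GPD is a valid model above $u_0$, both the shape $\xi$ and the modified scale $\sigma^{*}$ should be roughly constant in $u$ above $u_0$, which is what a horizontal line through the confidence bars checks.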

Initial threshold selection: MRL plot

Figure: Mean Residual Life plot (mean excess against threshold u)

- The MRL plot shows the mean of the excesses over the threshold, u, together with an approximate 95% confidence interval.
- We look for approximate linearity (from the lowest possible threshold) whilst staying within the confidence bounds.
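A single point of the MRL plot can be sketched as follows. This is an illustrative implementation under the assumption that the interval is the usual normal approximation (mean plus or minus 1.96 standard errors); the slides do not say how the bounds were computed, and the function name `mean_excess` is invented here.

```python
import math
import statistics


def mean_excess(data, u):
    """Return (mean excess, lower bound, upper bound) for threshold u.

    The bounds are an approximate 95% interval from a normal approximation,
    which is an assumption of this sketch.
    """
    excesses = [x - u for x in data if x > u]
    n = len(excesses)
    if n < 2:
        raise ValueError("need at least two exceedances of u")
    mean = statistics.mean(excesses)
    half_width = 1.96 * statistics.stdev(excesses) / math.sqrt(n)
    return mean, mean - half_width, mean + half_width


# Hypothetical data, invented for illustration only.
sample = [2.6, 3.1, 3.4, 3.9, 4.2, 5.7, 9.9]
for u in (3.0, 3.5, 4.0):
    print(u, mean_excess(sample, u))
```

Evaluating the function over a grid of thresholds and plotting the three values against u gives the MRL plot. The interval widens as u increases because fewer exceedances remain.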

Checking the chosen threshold

Checking threshold suitability

Figure: Post threshold calculation diagnostic plots (four panels: Probability Plot, Quantile Plot, Return Level Plot, Density Plot)

- After choosing our threshold, we create post-calculation diagnostic plots (with the threshold specified) to check its suitability.
- We desire linearity in the first 2 plots, points in between the confidence bounds in the 3rd and a curve that matches the shape of the histogram in the 4th plot.

Problems with this method of threshold selection

Why do we need a new method of threshold selection?

- Difficulty in initially choosing a threshold from the first graphs.
- Difficulty in interpreting the post-calculation diagnostic plots.
  - Why one threshold may be better than another when the difference between them is small.
  - Different people can interpret the graphs in different ways.
- Room for human error when looking only at graphs.
- Often difficult to identify the ‘best’ threshold.

A new technique

LR testing & p-values

Likelihood ratio tests

- Likelihood ratio tests are a type of hypothesis test.
- They are used to test a null hypothesis, H0.
- H0 is the default state of the world.
- They can be used to test at different significance levels.

POSSIBLE RESULTS:

- Reject H0.
- Do not reject H0 (if there isn’t significant evidence to do so).

p-values

- With LR tests we find a p-value: the probability of obtaining a test statistic at least as large as the one observed if H0 were true.
- If the evidence against H0 is strong then the p-value will be small.
- Before conducting a test, set a significance level, α (e.g. 0.05), and decide to reject H0 if the p-value < α.

Data & plots from old method

Gulf of Mexico data

Data analysis:

- Min = 2.573358
- Max = 9.910259
- 0 missing values
- 505 data values

Initial plots

Figure: Graphs before choosing threshold (three panels: modified scale against threshold, shape against threshold, and mean excess against u)

These plots suggest a threshold of 3.5 would be suitable.

Post-threshold choice diagnostic plots

Figure: Diagnostic plots for threshold 3.5 (four panels: Probability Plot, Quantile Plot, Return Level Plot, Density Plot)

Implementing the new technique

New technique introduction

- Likelihood ratio test to look at the threshold value.

Hypotheses:
H0: A single tail model is ok for the threshold.
H1: A single tail model is not ok for the threshold.

- The hypotheses are tested by comparing a single tail model with a tail model that has a change point, u.

Triangle plot

Figure: Triangle plot (change point u against lowest threshold v)

- The triangle plot shows the test statistics for models tested with different lowest thresholds (v) and different change points (u).
- The larger the test statistic, the larger the circle on the plot.
- Large test statistics indicate where the lowest threshold may be too low.

How to find the best threshold

- Run 200 simulations of the data under H0 from the chosen threshold to find the distribution of the LR test statistic.
- Use this distribution to find the p-value.
- If p < 0.05, increase the threshold and repeat the first 2 steps.
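The second step, turning simulated test statistics into a p-value, can be sketched as below. The slides do not say exactly how the p-value was computed from the 200 simulations; this sketch assumes the plain proportion of simulated statistics that are at least as large as the observed one. The function name `simulated_p_value` and the example numbers are invented for illustration.

```python
def simulated_p_value(observed_stat, simulated_stats):
    """Proportion of simulated LR statistics at least as large as the observed one.

    Using the plain proportion is an assumption of this sketch.
    """
    if not simulated_stats:
        raise ValueError("need at least one simulated statistic")
    count = sum(1 for s in simulated_stats if s >= observed_stat)
    return count / len(simulated_stats)


# Hypothetical statistics standing in for 200 simulations under H0.
simulated = list(range(200))
p = simulated_p_value(190, simulated)
print("reject H0" if p < 0.05 else "do not reject H0", p)
```

In practice each simulated statistic would come from simulating a data set from the fitted single tail model and recomputing the LR statistic on it.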

Results

Gulf of Mexico results

- p < 0.05: significant evidence to reject H0 (at the 5% significance level).
- p ≥ 0.05: not enough evidence to reject H0, no evidence against a single tail model from the chosen threshold.

Threshold (v)   3       3.1     3.1     3.2
p-value         0.015   0.135   0.12    0.43

Table: Thresholds and corresponding p-values
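The decision rule applied to the table can be written as a short loop: move up through the candidate thresholds and stop at the first one whose p-value is not below the significance level. This is a sketch of that rule only; the function name `lowest_acceptable_threshold` is invented here, and the threshold and p-value pairs are copied from the table as given.

```python
def lowest_acceptable_threshold(p_values_by_threshold, alpha=0.05):
    """Return the first threshold (in the order given) whose p-value is at least alpha.

    Returns None when H0 is rejected at every candidate threshold.
    """
    for threshold, p in p_values_by_threshold:
        if p >= alpha:
            return threshold
    return None


# (threshold, p-value) pairs as listed in the table.
results = [(3, 0.015), (3.1, 0.135), (3.1, 0.12), (3.2, 0.43)]
print(lowest_acceptable_threshold(results))
```

A return value of `None` corresponds to the case where H0 is rejected for every threshold tried.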

Analysis

Advantages of the new method

- More accurate
- Easier to interpret
- Less room for human error

Limitations

- Other variables are not taken into account (e.g. wave direction).
- If H0 is always rejected, it could be that no single model fits the tail, but this can’t be guaranteed.
- Seasonality in the data can affect the results.
- The triangle plot can be misleading if there is only a small number of data points.

Conclusion

- The old method is useful to see the area where the threshold should occur.
- The new technique helps us to see more accurately if a chosen threshold is ok.
- Overall, the new method of threshold selection removes a lot of the problems that were found in the old method.
- Although the new technique has limitations, many of these limitations would also be seen, or even undetected, in the old method too.
- Best solution: use the old method to indicate where the threshold might lie, then use the new method to quantify the credibility of a chosen threshold.

Thank you for listening. Any questions?

References

- Wadsworth, J. L. and Tawn, J. A. (2010). Likelihood-based Procedures for Threshold Diagnostics and Uncertainty in Extreme Value Modelling.
- Coles, S. (2007). An Introduction to Statistical Modeling of Extreme Values. Springer-Verlag, London, 4th edition.
- DeGroot, M. H. and Schervish, M. J. (2002). Probability and Statistics. Addison-Wesley, 3rd edition.
