Journal of Coastal Research
30
5
967–974
Coconut Creek, Florida
September 2014
Interval Estimations of Return Wave Height Based on Maximum Entropy Distribution
Sheng Dong†*, Shanshan Tao†, Chengchao Chen†, and C. Guedes Soares‡
† College of Engineering, Ocean University of China, Qingdao 266100, China
‡ Centre for Marine Technology and Engineering (CENTEC), Instituto Superior Técnico, Technical University of Lisbon, 1049-001 Lisbon, Portugal
ABSTRACT Dong, S.; Tao, S.; Chen, C., and Soares, C.G., 2014. Interval estimations of return wave height based on maximum entropy distribution. Journal of Coastal Research, 30(5), 967–974. Coconut Creek (Florida), ISSN 0749-0208. Five different interval estimation methods for extreme wave heights based on maximum entropy distribution are considered and compared. Three are parametric methods: Woodruff, maximum likelihood, and sample quantile asymptotic, and two are nonparametric methods: order statistics and sign test. The extreme significant wave height is fitted by a maximum entropy distribution, which is then used to conduct numerical simulations so as to apply the interval estimation methods to the 100 year return period estimates. These simulation results show that parametric methods have generally better performance than the nonparametric ones. Finally, a case study using the extreme wave height from Weizhou Island in the South China Sea is considered, and it is shown that the maximum likelihood method gives the best interval estimation for the actual data.
ADDITIONAL INDEX WORDS: Interval estimation, maximum entropy distribution, return wave height.
INTRODUCTION
Return wave heights must be taken into consideration in coastal engineering design, since the lifetime and reliability of offshore structures are directly influenced by the design wave height. In engineering applications, point estimations of return values of significant wave height are used more extensively, although confidence intervals are also provided in various cases (Ferreira and Guedes Soares, 1998; Guedes Soares and Scotto, 2004). Using only point estimations has disadvantages, as different point estimation methods will result in different return wave heights. If a structure is designed with too low a point estimation of the extreme wave height, its actual service lifetime may decrease, but if too high a point estimation is used, the cost will increase unnecessarily. Calculating an interval estimation of extreme significant wave heights is a good way to avoid this disadvantage of point estimations. For this assessment, various interval estimations of the return wave height can be obtained by different methods, and a region can be chosen from the overlapping intervals, as an alternative to combining various point estimates as suggested earlier by Guedes Soares (1988).

Maximum entropy distribution (MED) is an extension of some other distributions that can fit the extreme wave heights, such as Gumbel distribution, Weibull distribution, log-normal distribution, and Pearson type III distribution (Dong, Xu, and Liu, 2009). The return periods theoretically correspond to the quantiles of the long-term distributions to which the random variable conforms (Borgman, 1963). So, any interval estimation method of the quantiles can be used in the calculation of the confidence intervals of a given return period for significant wave height. In statistics, the parametric interval estimation methods of quantiles often use the asymptotic normality of the distribution. In this paper, the quantiles' interval estimation of MED is applied to represent the interval estimation of the return wave height.

DOI: 10.2112/JCOASTRES-D-12-00099.1 received 23 May 2012; accepted in revision 28 August 2012; corrected proofs received 7 November 2012; published pre-print online 7 December 2012.
*Corresponding author: [email protected]
© Coastal Education & Research Foundation 2014
MAXIMUM ENTROPY DISTRIBUTION
Jaynes (1968) introduced the maximum entropy principle for determining a probability distribution; a distribution of this type carries the largest uncertainty under the given additional constraint conditions. Zhang and Xu (2005) proposed one kind of MED, the distribution function of which is as follows:

$$G(x) = \int_{a_0}^{x} \alpha (t - a_0)^{\gamma} \exp\left[-\beta (t - a_0)^{\xi}\right] dt \tag{1}$$

in which $\beta$, $\gamma$, $\xi$, and $a_0$ are parameters of the MED to be determined from the sample of data. Here, $\alpha$ is a combination of $\beta$, $\gamma$, and $\xi$, and is given by:

$$\alpha = \xi \beta^{(\gamma+1)/\xi}\, \Gamma^{-1}\!\left(\frac{\gamma+1}{\xi}\right) \tag{2}$$

The moment estimation method of the parameters for the MED was presented by Zhang and Xu (2005). Since the sample size of actual observations is often not large enough, the sample skewness coefficient $C_s$ is often subject to large errors. For this reason, Dong et al. (2009) applied a curve-fitting method to estimate the parameters of the MED, and the results show that it works well. In this paper, the maximum likelihood method is used to estimate the parameters of the MED, as proposed by Dong et al. (2012).
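As a quick numerical illustration of Equations (1) and (2), the following minimal sketch evaluates the normalizing constant and checks that the MED density integrates to one. The function names `med_alpha` and `med_pdf` are hypothetical, the parameter symbols are written as `beta`, `gam`, `xi`, and `a0`, and the numerical values are the fitted parameters reported later in the paper for the Weizhou Island data; SciPy is assumed to be available.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma as gamma_fn


def med_alpha(beta, gam, xi):
    """Normalizing constant alpha of Equation (2)."""
    s = (gam + 1.0) / xi
    return xi * beta ** s / gamma_fn(s)


def med_pdf(x, beta, gam, xi, a0):
    """Integrand of Equation (1): the MED density, set to zero for x <= a0."""
    z = np.asarray(x, dtype=float) - a0
    zs = np.where(z > 0.0, z, 1.0)  # harmless placeholder where z <= 0
    dens = med_alpha(beta, gam, xi) * zs ** gam * np.exp(-beta * zs ** xi)
    return np.where(z > 0.0, dens, 0.0)


# Fitted values reported later in the paper: (beta, gamma, xi, a0).
PARAMS = (0.0280, 0.4108, 3.1000, 0.8820)

# Integrate the density from a0 to infinity; the result should be close to 1.
total, _ = quad(lambda t: float(med_pdf(t, *PARAMS)), PARAMS[3], np.inf)
print(f"alpha = {med_alpha(*PARAMS[:3]):.4f}, integral of density = {total:.6f}")
```

The check only confirms that the constant in Equation (2) normalizes the density; it says nothing about how well the MED fits a given data set.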
Dong et al.
INTERVAL ESTIMATION METHODS FOR RETURN WAVE HEIGHT
Borgman (1963) pointed out that the return wave height for a given return period corresponds to a quantile of the long-term distribution of significant wave heights. For example, if the annual extreme wave height follows a MED, then the 100-year return wave height corresponds to the top 0.01 quantile of the MED. Generally, the $N$-year return wave height corresponds to the top $1/N$ quantile of the respective distribution. So, nearly any interval estimation method for the quantiles of a population can be applied to the interval estimation of the return wave heights.

The exact sampling distributions of quantile statistics are often hard to derive, so interval estimation of quantiles has received relatively little attention. In practice, the asymptotic normality of the quantile statistics is applied to give asymptotic interval estimations, and all the parametric interval estimation methods considered here rely on this property.

Woodruff (1952) described a procedure for theoretically determining the confidence intervals for a finite population median. García, Cebrián, and Rodríguez (1998) revised Woodruff's method (WM) and presented a simulation study.

For any distribution function, the magnitude of the event (or quantile) of a return period can be calculated by using the general equation $x_T = \bar{x} \pm K s$ (Chow, 1964), where $x_T$ is the $T$-year return value, $\bar{x}$ and $s$ are the sample mean and standard deviation, respectively, and $K$ is a frequency factor. For a given distribution function, $K$ is a function of the parameters of the distribution, the size of the sample, and the exceedance probability associated with the quantile (Bâ, Díaz-Delgado, and Cârsteanu, 2001). An approximate $1 - \alpha$ confidence interval for $x_T$ is given by $\hat{x}_T \pm u_{\alpha/2} s_T$, where $u_{\alpha/2}$ denotes the upper $\alpha/2$ quantile of the standard normal distribution, and $s_T$ is the standard error of $x_T$ (Cunnane, 1989; Rao and Hamed, 2000).
Detailed discussions about $s_T$ for some distribution functions are available in various books of statistics, and the expressions of $s_T$ for the method of moments (MOM), the maximum likelihood method (MLM), and the probability-weighted moments method (PWM) are described, for example, in Rao and Hamed (2000) or in Kite (1977). Coles (2007) applied MLM to solve the interval estimations of the quantiles for Gumbel distributions and generalized extreme value (GEV) distributions in actual calculation of return values.

Cramer (1999) listed the mathematical expectation and variance of the sample $p$ quantile $x_p^*$ based on the order statistics. For data that are independent and identically distributed and satisfy certain conditions (Avramidis and Wilson, 1998; van Zwet, 1964), David (1981) revised the results given by Cramer (1999). When the sample size is large enough, the sample quantile follows a normal distribution, for which the mathematical expectation and variance can be solved using the previously described methods. This sample quantile asymptotic method (SQAM) can therefore also be used to calculate the confidence intervals of return values.

The confidence intervals obtained for the MED with the three methods WM, MLM, and SQAM take different forms from those for other probability distributions. They apply, respectively, the asymptotic normality of the estimated distribution $\hat{G}(x)$ at $x_p$ (the upper $p$ quantile), of the estimated parameters ($\hat{\beta}$, $\hat{\gamma}$, $\hat{\xi}$, and $\hat{a}_0$), and of the sample quantile $x_p^*$.

Sun (2000) listed the joint distribution of marginal distribution functions for two different order statistics, and this leads to an order statistic method (OSM) to solve the confidence intervals of quantiles. Tao et al. (2011) proposed a sign test method (STM) to estimate the interval of return extreme wave height. These two methods are based on nonparametric statistics, so they are independent of the distribution types. Therefore, OSM and STM can also be applied in the calculation of confidence intervals for the MED.

Let $x_1, x_2, \ldots, x_n$ be an independent, identically distributed sample from a MED for which the distribution function is as in Equation (1). The top $p$ quantile of the population is defined as:

$$x_p = \inf\{x \mid G(x) \ge 1 - p\} = G^{-1}(1 - p) \tag{3}$$

So, if $p = 1/N$, $x_p$ is the $N$-year return wave height. Assume that the maximum likelihood estimations of $\beta$, $\gamma$, $\xi$, and $a_0$ of this MED are $\hat{\beta}$, $\hat{\gamma}$, $\hat{\xi}$, and $\hat{a}_0$, and that the estimation of the distribution function is $\hat{G}(x)$. The five different interval estimation methods of return wave heights are described in the following sections.
Woodruff's Method
For any two constants $d_1$ and $d_2$, and for the upper $p$ quantile $x_p$, let

$$1 - \alpha = P\left\{d_1 \le \hat{G}(x_p) \le d_2\right\} \cong P\left\{\hat{G}^{-1}(d_1) \le x_p \le \hat{G}^{-1}(d_2)\right\} \tag{4}$$

Hence, the interval $[\hat{G}^{-1}(d_1), \hat{G}^{-1}(d_2)]$ is the $1 - \alpha$ approximate confidence interval for $x_p$.

If the sample size is sufficiently large, $\hat{G}(x_p)$ is asymptotically normal. Since $E[\hat{G}(x_p)] = G(x_p) = 1 - p$ and $V[\hat{G}(x_p)] = p(1 - p)/n$, the confidence interval is

$$\left[\hat{G}^{-1}\!\left(1 - p - u_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right),\; \hat{G}^{-1}\!\left(1 - p + u_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right)\right] \tag{5}$$

where $u_{\alpha/2}$ denotes the top $\alpha/2$ quantile of the standard normal distribution. Here, $\hat{G}^{-1}(x)$ can be calculated numerically by Equation (3).
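Equation (5) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the helper names `med_quantile` and `woodruff_interval` are hypothetical, the inverse distribution function is evaluated through the inverse regularized incomplete gamma function (the gamma representation the paper derives in its maximum likelihood section), and clipping the probabilities $d_1$ and $d_2$ to $[0, 1]$ is an assumption of the sketch, so that $d_2 \ge 1$ yields an infinite upper limit.

```python
import numpy as np
from scipy.special import gammaincinv
from scipy.stats import norm


def med_quantile(q, beta, gam, xi, a0):
    """Inverse MED distribution function G^{-1}(q) via the gamma quantile."""
    y = gammaincinv((gam + 1.0) / xi, q)
    return (y / beta) ** (1.0 / xi) + a0


def woodruff_interval(p, n, params, alpha=0.05):
    """Approximate 1 - alpha interval of Equation (5) for the top-p quantile."""
    u = norm.ppf(1.0 - alpha / 2.0)            # upper alpha/2 normal quantile
    half = u * np.sqrt(p * (1.0 - p) / n)
    d1 = np.clip(1.0 - p - half, 0.0, 1.0)     # assumption: clip to [0, 1]
    d2 = np.clip(1.0 - p + half, 0.0, 1.0)
    return med_quantile(d1, *params), med_quantile(d2, *params)


PARAMS = (0.0280, 0.4108, 3.1000, 0.8820)      # fitted values from the paper

lo10, hi10 = woodruff_interval(p=0.1, n=100, params=PARAMS)
lo100, hi100 = woodruff_interval(p=0.01, n=34, params=PARAMS)
print(f"10-year level, n = 100: [{lo10:.3f}, {hi10:.3f}]")
print(f"100-year level, n = 34: [{lo100:.3f}, {hi100}]")
```

With $p = 0.01$ and $n = 34$ the upper probability exceeds 1, so the sketch returns an unbounded upper limit; how such cases are capped in practice is a separate modelling decision.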
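Maximum Likelihood Method
Let $l(\beta, \gamma, \xi, a_0)$ be the log-likelihood function of the MED parameters $\beta$, $\gamma$, $\xi$, and $a_0$; then

$$l(\beta, \gamma, \xi, a_0) = n \ln \xi + n\,\frac{\gamma+1}{\xi} \ln \beta - n \ln \Gamma\!\left(\frac{\gamma+1}{\xi}\right) + \gamma \sum_{i=1}^{n} \ln(x_i - a_0) - \beta \sum_{i=1}^{n} (x_i - a_0)^{\xi} \tag{6}$$

If the sample size $n$ is sufficiently large and under some regularity conditions, then

$$(\hat{\beta}, \hat{\gamma}, \hat{\xi}, \hat{a}_0) \;\dot{\sim}\; N_4\!\left((\beta, \gamma, \xi, a_0),\; I_E(\beta, \gamma, \xi, a_0)^{-1}\right) \tag{7}$$

where $\dot{\sim}$ denotes "asymptotically follows," $N_4$ denotes the four-dimensional normal distribution, and $I_E(\beta, \gamma, \xi, a_0)$ is the Fisher information matrix, which can be written as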
Journal of Coastal Research, Vol. 30, No. 5, 2014
MED Interval Estimations
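$$I_E(\beta, \gamma, \xi, a_0) = E(D) = -E\begin{pmatrix}
\dfrac{\partial^2 l}{\partial \beta^2} & \dfrac{\partial^2 l}{\partial \beta \partial \gamma} & \dfrac{\partial^2 l}{\partial \beta \partial \xi} & \dfrac{\partial^2 l}{\partial \beta \partial a_0} \\
\dfrac{\partial^2 l}{\partial \gamma \partial \beta} & \dfrac{\partial^2 l}{\partial \gamma^2} & \dfrac{\partial^2 l}{\partial \gamma \partial \xi} & \dfrac{\partial^2 l}{\partial \gamma \partial a_0} \\
\dfrac{\partial^2 l}{\partial \xi \partial \beta} & \dfrac{\partial^2 l}{\partial \xi \partial \gamma} & \dfrac{\partial^2 l}{\partial \xi^2} & \dfrac{\partial^2 l}{\partial \xi \partial a_0} \\
\dfrac{\partial^2 l}{\partial a_0 \partial \beta} & \dfrac{\partial^2 l}{\partial a_0 \partial \gamma} & \dfrac{\partial^2 l}{\partial a_0 \partial \xi} & \dfrac{\partial^2 l}{\partial a_0^2}
\end{pmatrix} \tag{8}$$

So

$$\begin{aligned}
1 - p = G(x_p) &= \xi \beta^{(\gamma+1)/\xi}\, \Gamma^{-1}\!\left(\frac{\gamma+1}{\xi}\right) \int_{a_0}^{x_p} (t - a_0)^{\gamma} \exp\left[-\beta (t - a_0)^{\xi}\right] dt \\
&= \Gamma^{-1}\!\left(\frac{\gamma+1}{\xi}\right) \int_{0}^{\beta (x_p - a_0)^{\xi}} s^{[(\gamma+1)/\xi] - 1} \exp(-s)\, ds \\
&= F\!\left(\beta (x_p - a_0)^{\xi};\; \frac{\gamma+1}{\xi},\; 1\right)
\end{aligned} \tag{9}$$

where $F(x; a, b)$ denotes the distribution function of the gamma distribution with parameters $a$ and $b$. The top $p$ quantile $x_p$ can be denoted by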
$$x_p = x_p(\beta, \gamma, \xi, a_0) = \left[\frac{1}{\beta}\, F^{-1}\!\left(1 - p;\; \frac{\gamma+1}{\xi},\; 1\right)\right]^{1/\xi} + a_0 \tag{10}$$
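The gamma representation in Equations (9) and (10) maps directly onto standard special functions. The sketch below is illustrative only: `med_cdf` and `med_return_value` are hypothetical names, SciPy's `gammainc` and `gammaincinv` (the regularized lower incomplete gamma function and its inverse) play the role of $F$ and $F^{-1}$ with unit scale, and the parameter values are the fitted ones reported later in the paper.

```python
import numpy as np
from scipy.special import gammainc, gammaincinv


def med_cdf(x, beta, gam, xi, a0):
    """G(x) via Equation (9): regularized lower incomplete gamma function."""
    z = np.maximum(np.asarray(x, dtype=float) - a0, 0.0)
    return gammainc((gam + 1.0) / xi, beta * z ** xi)


def med_return_value(N, beta, gam, xi, a0):
    """N-year return value x_p with p = 1/N, via Equation (10)."""
    p = 1.0 / np.asarray(N, dtype=float)
    y_p = gammaincinv((gam + 1.0) / xi, 1.0 - p)  # gamma quantile, unit scale
    return (y_p / beta) ** (1.0 / xi) + a0


PARAMS = (0.0280, 0.4108, 3.1000, 0.8820)  # (beta, gamma, xi, a0)
periods = [10, 20, 50, 100, 200, 500, 1000]
levels = med_return_value(periods, *PARAMS)
for N, x in zip(periods, levels):
    print(f"N = {N:4d} years: x_p = {x:.3f}, G(x_p) = {med_cdf(x, *PARAMS):.4f}")
```

Evaluating the distribution function at each computed return value recovers $1 - 1/N$, which is a convenient consistency check on any implementation of Equation (10).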
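For fixed probability $p$, the relationship between $y_p = F^{-1}\!\left(1 - p; \frac{\gamma+1}{\xi}, 1\right)$ and the parameters $\gamma$ and $\xi$ (here $\gamma$ and $\xi$ are taken around $\hat{\gamma}$ and $\hat{\xi}$) can be approximated by a linear fit. Assume that the expression is given by

$$F^{-1}\!\left(1 - p;\; \frac{\gamma+1}{\xi},\; 1\right) \cong c_0 + c_1 \gamma + c_2 \xi \tag{11}$$

where $c_i$ ($i = 0, 1, 2$) are the regression coefficients. For sufficiently large sample size $n$, the maximum likelihood estimation of $x_p$ asymptotically follows a normal distribution:

$$\hat{x}_p = \left[\frac{1}{\hat{\beta}}\, F^{-1}\!\left(1 - p;\; \frac{\hat{\gamma}+1}{\hat{\xi}},\; 1\right)\right]^{1/\hat{\xi}} + \hat{a}_0 \;\dot{\sim}\; N\!\left(x_p,\; \nabla\hat{x}_p^{T}\, \hat{V}\, \nabla\hat{x}_p\right) \tag{12}$$

where $\hat{V} = [D]^{-1}\big|_{(\beta, \gamma, \xi, a_0) = (\hat{\beta}, \hat{\gamma}, \hat{\xi}, \hat{a}_0)}$ and

$$\nabla\hat{x}_p = \left(\frac{\partial x_p}{\partial \beta}, \frac{\partial x_p}{\partial \gamma}, \frac{\partial x_p}{\partial \xi}, \frac{\partial x_p}{\partial a_0}\right)^{T}\Bigg|_{(\beta, \gamma, \xi, a_0) = (\hat{\beta}, \hat{\gamma}, \hat{\xi}, \hat{a}_0)}.$$

Here

$$\left(\frac{\partial x_p}{\partial \beta}, \frac{\partial x_p}{\partial \gamma}, \frac{\partial x_p}{\partial \xi}, \frac{\partial x_p}{\partial a_0}\right) = \left(-d_2\, y_p\, t^{k},\; d_1 c_1 t^{k},\; (d_1 c_2 - d_3 \ln t)\, t^{k},\; 1\right) \tag{13}$$

in which $t = \beta^{-1} y_p$, $k = \xi^{-1} - 1$, $d_1 = (\xi\beta)^{-1}$, $d_2 = (\xi\beta^2)^{-1}$, and $d_3 = (\xi^2\beta)^{-1}$. Then, by using MLM, the approximate interval estimation of $x_p$ with confidence degree $1 - \alpha$ is

$$\hat{x}_p \pm u_{\alpha/2} \sqrt{\nabla\hat{x}_p^{T}\, \hat{V}\, \nabla\hat{x}_p} \tag{14}$$

Sample Quantile Asymptotic Method
For the sample $x_1, x_2, \ldots, x_n$ and any fixed probability $p$ ($0 < p < 1$), $x_p^* = x_{([np])}$ can be taken as the top $p$ quantile of the sample, where $x_{(i)}$ is the $i$th largest number of the sample. For sufficiently large sample size $n$,

$$\sqrt{n}\,(x_p^* - x_p) \;\dot{\sim}\; N\!\left(0,\; \frac{p(1-p)}{g^2(x_p)}\right) \tag{15}$$

where $g(x)$ is the probability density function of $G(x)$, and

$$g(x) = \alpha (x - a_0)^{\gamma} \exp\left[-\beta (x - a_0)^{\xi}\right] \tag{16}$$

Then the $1 - \alpha$ approximate confidence interval by using SQAM is

$$x_p^* \pm \frac{u_{1-\alpha/2}}{g(\hat{x}_p)} \sqrt{\frac{p(1-p)}{n}} \tag{17}$$

Order Statistic Method
Let $x_{(1)}, x_{(2)}, \ldots, x_{(n)}$ be the order statistics of the sample $x_1, x_2, \ldots, x_n$. Then the joint density function of $G(x_{(k)})$ and $G(x_{(l)})$ is

$$f(u, v) = \frac{\Gamma(n+1)}{\Gamma(k)\,\Gamma(l-k)\,\Gamma(n-l+1)}\, u^{k-1} (v - u)^{l-k-1} (1 - v)^{n-l} \tag{18}$$

where $1 \le k < l \le n$ and $0 < u < v < 1$. The following result can be deduced easily:

$$P\left\{X_{(k)} < x_p < X_{(l)}\right\} = \frac{\Gamma(n+1)}{\Gamma(k)\,\Gamma(n-k+1)} \int_0^{1-p} s^{k-1} (1-s)^{n-k}\, ds - \frac{\Gamma(n+1)}{\Gamma(l)\,\Gamma(n-l+1)} \int_0^{1-p} t^{l-1} (1-t)^{n-l}\, dt \tag{19}$$

In order to get the $1 - \alpha$ confidence interval of $x_p$, $k$ and $l$ are chosen such that

$$P\left\{x_p \le X_{(k)}\right\} = P\left\{1 - p \le G(X_{(k)})\right\} = 1 - \frac{\Gamma(n+1)}{\Gamma(k)\,\Gamma(n-k+1)} \int_0^{1-p} s^{k-1} (1-s)^{n-k}\, ds \le \frac{\alpha}{2} \tag{20}$$

and

$$P\left\{x_p \ge X_{(l)}\right\} = P\left\{1 - p \ge G(X_{(l)})\right\} = \frac{\Gamma(n+1)}{\Gamma(l)\,\Gamma(n-l+1)} \int_0^{1-p} s^{l-1} (1-s)^{n-l}\, ds \le \frac{\alpha}{2} \tag{21}$$

So the $1 - \alpha$ confidence interval of $x_p$ is $[x_{(k)}, x_{(l)}]$; here $k$ and $l$ should satisfy

$$k = \min_{1 \le i \le n}\left\{i : F_{x_{(i)}}(x_p) - (1 - \alpha/2) \ge 0\right\} \quad \text{and} \quad l = \max_{1 \le j \le n}\left\{j : F_{x_{(j)}}(x_p) - \alpha/2 < 0\right\} \tag{22}$$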
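The tail conditions in Equations (20) and (21) can be checked numerically with the regularized incomplete beta function, because the integral expressions there equal $I_{1-p}(i, n-i+1)$. The sketch below is a hedged illustration: `osm_indices` is a hypothetical name, order statistics are indexed in ascending order starting from 1, and picking the largest $k$ and the smallest $l$ that satisfy the two conditions is a choice made in this sketch to obtain the shortest admissible interval. When no index satisfies a condition, the sketch returns `None` for that end.

```python
import numpy as np
from scipy.special import betainc


def osm_indices(n, p, alpha=0.05):
    """Indices (k, l) meeting the tail conditions of Equations (20) and (21)."""
    i = np.arange(1, n + 1)
    # P{X_(i) <= x_p} = I_{1-p}(i, n-i+1), the regularized incomplete beta.
    F = betainc(i, n - i + 1, 1.0 - p)
    ok_lower = (1.0 - F) <= alpha / 2.0   # Equation (20)
    ok_upper = F <= alpha / 2.0           # Equation (21)
    k = int(i[ok_lower].max()) if ok_lower.any() else None
    l = int(i[ok_upper].min()) if ok_upper.any() else None
    return k, l


k_a, l_a = osm_indices(n=100, p=0.1)   # 10-year level, 100 annual maxima
k_b, l_b = osm_indices(n=34, p=0.01)   # 100-year level, 34 annual maxima
print("n = 100, p = 0.10:", (k_a, l_a))
print("n = 34,  p = 0.01:", (k_b, l_b))
```

For $n = 34$ and $p = 0.01$ no upper index satisfies Equation (21), since even the sample maximum falls below $x_p$ with probability $0.99^{34} \approx 0.71$; this illustrates why distribution-free intervals struggle when the return period is long relative to the record length.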
Here, the function $F_{x_{(i)}}(x)$ appearing in Equation (22) is the distribution function of the $i$th order statistic,

$$F_{x_{(i)}}(x) = \frac{\Gamma(n+1)}{\Gamma(i)\,\Gamma(n-i+1)} \int_0^{G(x)} s^{i-1} (1-s)^{n-i}\, ds.$$

Sign Test Method
Since $x_p$ is the top $p$ quantile, every sample point is smaller than $x_p$ with probability $1 - p$ and larger with probability $p$. If the number of sample points larger than $x_p$ is denoted by $S^+$, and the number of sample points smaller than $x_p$ is denoted by $S^-$, then

$$S^+ \sim B(n, p), \qquad S^- \sim B(n, 1 - p) \tag{23}$$

Here, $S^+ \sim B(n, p)$ implies that $S^+$ follows a binomial distribution with parameters $n$ and $p$. Consider the event $A = \{X_{(k)} \le x_p \le X_{(l)}\}$; then

$$\begin{aligned}
A^C &= \{x_p \text{ is smaller than some } X_{(i)},\ i = 1, \ldots, k-1\} \cup \{x_p \text{ is not smaller than some } X_{(j)},\ j = l+1, \ldots, n\} \\
&= \{X_{(1)} \ge x_p\} \cup \{X_{(i)} < x_p \text{ and } X_{(i+1)} \ge x_p;\ i = 1, \ldots, k-1\} \\
&\quad \cup \{X_{(n)} \le x_p\} \cup \{X_{(j)} \le x_p \text{ and } X_{(j+1)} > x_p;\ j = l+1, \ldots, n\} \\
&= \{S^- = i;\ i = 0, 1, \ldots, k-1\} \cup \{S^+ = j;\ j = 0, 1, \ldots, n-l\}
\end{aligned} \tag{24}$$

So let

$$\begin{aligned}
P(A^C) &= \sum_{i=0}^{k-1} \binom{n}{i} (1-p)^{i} p^{n-i} + \sum_{j=0}^{n-l} \binom{n}{j} (1-p)^{n-j} p^{j} \\
&= \sum_{i=0}^{k-1} \binom{n}{i} (1-p)^{i} p^{n-i} + \sum_{j=l+1}^{n} \binom{n}{j} (1-p)^{j} p^{n-j} \\
&= \sum_{i=0}^{k-1} b(i; n, 1-p) + \sum_{j=l+1}^{n} b(j; n, 1-p) < \alpha
\end{aligned} \tag{25}$$

In actual practice, $k$ and $l$ are chosen to satisfy $\sum_{i=0}^{k-1} b(i; n, 1-p) < \alpha/2$ and $\sum_{j=l+1}^{n} b(j; n, 1-p) < \alpha/2$. So $k = \max_{1 \le s \le n}\{s : \sum_{i=0}^{s-1} b(i; n, 1-p) < \alpha/2\}$, and $l = \max_{1 \le t \le n}\{t : \sum_{j=t+1}^{n} b(j; n, 1-p) < \alpha/2\}$. Then the interval estimation with STM is $[x_{(k)}, x_{(l)}]$.

EXAMPLE OF APPLICATION
In the following, the annual extreme wave height data in the east direction measured at Weizhou Island hydrological station in the South China Sea (Xia and Li, 2001) are used to verify the efficiency of the previously described five interval estimation methods (see Figure 1).

Figure 1. The scatter plot of annual extreme wave heights in the east direction measured at Weizhoudao hydrological station from 1960 to 1993.

Fitting and K-S Test for the Data with MED
The maximum likelihood estimations of the parameters of the MED are obtained as follows (Dong et al., 2012). Suppose the sample is $x_1, x_2, \ldots, x_n$ and $x_{(1)} \le x_{(2)} \le \ldots \le x_{(n)}$ are its order statistics. Then, based on the log-likelihood function $l(\beta, \gamma, \xi, a_0)$ in Equation (6), setting the partial derivatives with respect to the four parameters $\beta$, $\gamma$, $\xi$, and $a_0$ to zero gives Equations (26a–d).

$$\frac{\partial l}{\partial \beta} = \sum_{i=1}^{n} \left[\frac{1}{\beta}\,\frac{\gamma+1}{\xi} - (x_i - a_0)^{\xi}\right] = 0 \tag{26a}$$

$$\frac{\partial l}{\partial \gamma} = \sum_{i=1}^{n} \left\{\frac{\ln \beta}{\xi} - \frac{1}{\xi}\, \frac{\Gamma'\!\left(\frac{\gamma+1}{\xi}\right)}{\Gamma\!\left(\frac{\gamma+1}{\xi}\right)} + \ln(x_i - a_0)\right\} = 0 \tag{26b}$$

$$\frac{\partial l}{\partial \xi} = \sum_{i=1}^{n} \left\{\frac{1}{\xi} - \frac{\gamma+1}{\xi^2} \ln \beta + \frac{\gamma+1}{\xi^2}\, \frac{\Gamma'\!\left(\frac{\gamma+1}{\xi}\right)}{\Gamma\!\left(\frac{\gamma+1}{\xi}\right)} - \beta (x_i - a_0)^{\xi} \ln(x_i - a_0)\right\} = 0 \tag{26c}$$

$$\frac{\partial l}{\partial a_0} = \sum_{i=1}^{n} \left[-\frac{\gamma}{x_i - a_0} + \xi \beta (x_i - a_0)^{\xi - 1}\right] = 0 \tag{26d}$$

Then, by rearranging these equations, we have

$$\gamma + 1 = \beta\, \xi\, A_{\xi, a_0} \tag{27}$$

$$\beta = \beta_{\xi, a_0} = \frac{n}{\sum_{i=1}^{n} \left\{\xi \left[(x_i - a_0)^{\xi} - A_{\xi, a_0}\right] \ln(x_i - a_0)\right\}} \tag{28}$$

$$\ln \beta_{\xi, a_0} - \frac{\Gamma'(\beta_{\xi, a_0} A_{\xi, a_0})}{\Gamma(\beta_{\xi, a_0} A_{\xi, a_0})} = -\frac{\xi}{n} \sum_{i=1}^{n} \ln(x_i - a_0) \tag{29}$$

$$K = \frac{\sum_{i=1}^{n} \xi\, \beta_{\xi, a_0} (x_i - a_0)^{\xi - 1}}{\sum_{i=1}^{n} \dfrac{1}{x_i - a_0}} + 1 - \xi\, \beta_{\xi, a_0} A_{\xi, a_0} \tag{30}$$

where

$$A_{\xi, a_0} = \frac{1}{n} \sum_{i=1}^{n} (x_i - a_0)^{\xi}.$$

Figure 2. The fitting figure of the MED for the data.

Because $0 \le a_0 \le x_{(1)}$, $a_0$ can be chosen from this interval. Suppose the value of $a_0$ is given; then $A_{\xi, a_0}$ is determined by $\xi$. Substituting Equation (28) into Equation (29) gives a new equation with only one unknown parameter, $\xi$, from which the estimation of $\xi$ can be solved. Then, $\beta$ can be calculated by Equation (28). Substituting these values of $\beta$, $\xi$, and $a_0$ into Equation (27), the parameter $\gamma$ can be estimated. Since there are many groups of solutions, each group is substituted into Equation (30), and the group $(\beta, \gamma, \xi, a_0)$ that makes $K$ the smallest is chosen as the estimation of the parameters for the respective MED of the sample $x_1, x_2, \ldots, x_n$.

The maximum likelihood estimations of the parameters of the MED according to the annual extreme wave heights are $(\hat{\beta}, \hat{\gamma}, \hat{\xi}, \hat{a}_0) = (0.0280, 0.4108, 3.1000, 0.8820)$, and the fitting result is shown in Figure 2.
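As a rough cross-check of the parameter estimation, the log-likelihood in Equation (6) can also be maximized directly with a general-purpose optimizer. The sketch below deliberately swaps in a different technique from the stepwise solution described above: it uses SciPy's Nelder-Mead simplex search on synthetic data. The names `neg_log_likelihood` and `fit_med`, the starting point, and the constraint $\gamma \ge 0$ (imposed here only to keep the likelihood bounded as $a_0$ approaches the sample minimum) are all assumptions of the sketch.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import gammaln


def neg_log_likelihood(theta, x):
    """Negative of the log-likelihood in Equation (6)."""
    beta, gam, xi, a0 = theta
    # Feasible region assumed in this sketch; infeasible points get a penalty.
    if beta <= 0 or xi <= 0 or gam < 0 or a0 < 0 or a0 >= x.min():
        return 1e12
    n, s, z = x.size, (gam + 1.0) / xi, x - a0
    ll = (n * np.log(xi) + n * s * np.log(beta) - n * gammaln(s)
          + gam * np.sum(np.log(z)) - beta * np.sum(z ** xi))
    return -ll


def fit_med(x, start):
    """Maximize Equation (6) numerically; returns (estimate, log-likelihood)."""
    x = np.asarray(x, dtype=float)
    res = minimize(neg_log_likelihood, start, args=(x,), method="Nelder-Mead",
                   options={"maxiter": 5000, "xatol": 1e-6, "fatol": 1e-8})
    return res.x, -res.fun


# Synthetic sample: if Y ~ Gamma((gamma+1)/xi, 1), then
# X = (Y / beta)**(1/xi) + a0 follows the MED (see Equation (9)).
rng = np.random.default_rng(0)
beta, gam, xi, a0 = 0.0280, 0.4108, 3.1000, 0.8820
sample = (rng.gamma((gam + 1.0) / xi, 1.0, size=200) / beta) ** (1.0 / xi) + a0

start = np.array([0.05, 0.5, 2.5, 0.5])   # hypothetical starting point
theta_hat, ll_hat = fit_med(sample, start)
print("estimate (beta, gamma, xi, a0):", np.round(theta_hat, 4))
```

A simplex search can stall in flat regions of a four-parameter likelihood, so the result is best treated as a sanity check on estimates obtained by other means rather than as a replacement for them.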
The K-S test does not reject this fit, and its p-value is 0.717. Figure 2 shows that the MED with the estimated parameters fits these annual extreme wave height data very well.

Numerical Simulation and the Comparison of Interval Estimation Methods
First, a batch of random numbers $x_1, x_2, \ldots, x_n$ is generated with fixed sample size $n$ from the MED with parameters $(\beta, \gamma, \xi, a_0) = (0.0280, 0.4108, 3.1000, 0.8820)$ (which were solved above). Then, the interval estimations of the 100-year return wave height are evaluated with WM, MLM, SQAM, OSM, and STM at $\alpha = 0.05$. This procedure is repeated 100 times, and the average intervals and the frequency ratios with which the theoretical return value falls into these intervals are computed. The process is carried out for every sample size $n$ ($n$ = 10, 20, 50, 100, 200). See Figures 3–7.

Figures 3–7 indicate that if the sample size is larger than 100, the average interval estimations solved by OSM and STM (nonparametric methods) almost always include the theoretical return wave heights. As the sample sizes become larger, the widths of the intervals solved by MLM and SQAM become smaller, while the ones solved by OSM do not change very much, and the ones solved by STM even become larger. When the sample size is smaller than 50, the widths of the intervals solved by WM become smaller as the sample size becomes larger, but if the sample size is larger than 50, the widths no longer change appreciably.

It is a common trend that the frequency ratios for almost all the interval estimation methods increase and approach 1 as the sample sizes become larger. However, the frequency ratios for the MLM and SQAM methods oscillate, although they are all larger than 0.8. The reason is that as the return period increases, $\nabla\hat{x}_p^{T}\hat{V}\nabla\hat{x}_p$ in Equation (14) easily takes negative values, so that values like NaN (not a number) appear in the interval estimations; also, $g(\hat{x}_p)$ in Equation (17) becomes very small, so that infinite values appear. Because these values do not satisfy the requirement, they were removed during the calculation.

The results indicate that the parametric interval estimation methods behave much better than the nonparametric methods. However, when the sample size is sufficiently large (generally above 100), the nonparametric interval methods also give good results.

Figure 3. Interval estimation with WM. The vertical line segments and the asterisk points denote the average interval estimations and the theoretical point estimation of the 100-year return wave height, respectively, and the numbers in the legends denote, in turn for the different sample sizes, the frequency ratios with which the theoretical return values fall into the intervals.

Figure 4. Interval estimation with MLM. The vertical line segments and the asterisk points denote the average interval estimations and the theoretical point estimation of the 100-year return wave height, respectively, and the numbers in the legends denote, in turn for the different sample sizes, the frequency ratios with which the theoretical return values fall into the intervals.

Figure 5. Interval estimation with SQAM. The vertical line segments and the asterisk points denote the average interval estimations and the theoretical point estimation of the 100-year return wave height, respectively, and the numbers in the legends denote, in turn for the different sample sizes, the frequency ratios with which the theoretical return values fall into the intervals.

Figure 7. Interval estimation with STM. The vertical line segments and the asterisk points denote the average interval estimations and the theoretical point estimation of the 100-year return wave height, respectively, and the numbers in the legends denote, in turn for the different sample sizes, the frequency ratios with which the theoretical return values fall into the intervals.
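The simulation design described in this section (generate MED samples, build an interval, record whether the theoretical return value is covered, and average over repetitions) can be sketched as follows. This is an illustration under stated assumptions rather than a reproduction of the study: only a distribution-free sign-test style interval is implemented, so no refitting of the MED is needed; the names `med_quantile`, `med_sample`, `sign_test_indices`, and `coverage` are hypothetical; and the upper index is taken as the smallest $l$ with $P(S^- \ge l) \le \alpha/2$, a convention of this sketch that makes the interval conservative.

```python
import numpy as np
from scipy.special import gammaincinv
from scipy.stats import binom

BETA, GAM, XI, A0 = 0.0280, 0.4108, 3.1000, 0.8820  # fitted values from the paper


def med_quantile(q):
    """Theoretical MED quantile G^{-1}(q) for the parameters above."""
    return (gammaincinv((GAM + 1.0) / XI, q) / BETA) ** (1.0 / XI) + A0


def med_sample(rng, size):
    """Draw MED variates by transforming unit-scale gamma variates."""
    y = rng.gamma((GAM + 1.0) / XI, 1.0, size=size)
    return (y / BETA) ** (1.0 / XI) + A0


def sign_test_indices(n, p, alpha=0.05):
    """Ascending, 1-based indices (k, l) from binomial tails of S^- ~ B(n, 1-p)."""
    i = np.arange(1, n + 1)
    ok_lower = binom.cdf(i - 1, n, 1.0 - p) <= alpha / 2.0  # P(S^- <= i-1)
    ok_upper = binom.sf(i - 1, n, 1.0 - p) <= alpha / 2.0   # P(S^- >= i)
    k = int(i[ok_lower].max()) if ok_lower.any() else None
    l = int(i[ok_upper].min()) if ok_upper.any() else None
    return k, l


def coverage(n, p, alpha=0.05, reps=100, seed=0):
    """Frequency ratio of intervals covering x_p, and the mean finite width."""
    rng = np.random.default_rng(seed)
    x_true = med_quantile(1.0 - p)
    k, l = sign_test_indices(n, p, alpha)
    hits, widths = 0, []
    for _ in range(reps):
        x = np.sort(med_sample(rng, n))
        lo = x[k - 1] if k is not None else -np.inf
        hi = x[l - 1] if l is not None else np.inf
        hits += int(lo <= x_true <= hi)
        if np.isfinite(hi - lo):
            widths.append(hi - lo)
    return hits / reps, (float(np.mean(widths)) if widths else float("nan"))


ratio, width = coverage(n=100, p=0.1, reps=400)
print(f"10-year level, n = 100: frequency ratio = {ratio:.3f}, mean width = {width:.3f}")
```

Replacing `sign_test_indices` with any other interval constructor gives the corresponding comparison; for the 100-year level, much larger samples are needed before a finite upper index exists.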
Figure 6. Interval estimation with OSM. The vertical line segments and the asterisk points denote the average interval estimations and the theoretical point estimation of the 100-year return wave height, respectively, and the numbers in the legends denote, in turn for the different sample sizes, the frequency ratios with which the theoretical return values fall into the intervals.

Actual Results
The interval estimations of the return wave heights near Weizhou Island with return periods $N$ = 1000, 500, 200, 100, 50, 20, 10 (years) can be seen in the plots in Figures 8–12.

These figures show that when the return period is larger than 100 years, the estimated return wave heights solved by OSM and STM do not fall in the confidence intervals, but as the return period becomes smaller, the methods OSM and STM behave much better. The parametric methods WM, MLM, and SQAM always behave well, whether the return period is large or small, and when the return period is larger than 5 years, the length of the confidence intervals solved by MLM is the smallest.

The reason that the upper limits of the intervals of Figure 8 are equal is that when $\alpha = 0.05$, $p = 0.01$, and the sample size is 34, $d_2 = 1 - p + u_{\alpha/2}\sqrt{p(1-p)/n}$ is larger than 1. The return period $N$ and $d_2$ are plotted in Figure 13. In Figure 13, $d_2$ first increases and then decreases when the return period $N$ increases. However, when $10 \le N \le 1000$, $d_2$ is always larger than 1, so the upper confidence limits in Equation (5) should all take the maximum value.

Figure 8. Interval estimation of the actual return wave heights with WM. The vertical line segments denote the interval estimation of the N-year return wave height.

Figure 9. Interval estimation of the actual return wave heights with MLM. The vertical line segments denote the interval estimation of the N-year return wave height.

Figure 11. Interval estimation of the actual return wave heights with OSM. The vertical line segments denote the interval estimation of the N-year return wave height.
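The behaviour of $d_2$ described for the Weizhou Island sample can be verified with a few lines of arithmetic. The sketch below evaluates $d_2 = 1 - p + u_{\alpha/2}\sqrt{p(1-p)/n}$ with $p = 1/N$ for a sample size of 34; the function name `woodruff_d2` and the particular grid of return periods are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import norm


def woodruff_d2(N, n, alpha=0.05):
    """Upper probability d2 of Equation (5) for the N-year level (p = 1/N)."""
    p = 1.0 / np.asarray(N, dtype=float)
    return 1.0 - p + norm.ppf(1.0 - alpha / 2.0) * np.sqrt(p * (1.0 - p) / n)


periods = [10, 20, 50, 100, 200, 500, 1000]
d2 = woodruff_d2(periods, n=34)
for N, value in zip(periods, d2):
    print(f"N = {N:4d} years: d2 = {value:.4f}")
```

On this grid $d_2$ stays above 1 throughout, rising from $N = 10$ to a maximum near $N = 50$ and then declining, which is consistent with the description of Figure 13.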
Figure 10. Interval estimation of the actual return wave heights with SQAM. The vertical line segments denote the interval estimation of the N-year return wave height.

Figure 12. Interval estimation of the actual return wave heights with STM. The vertical line segments denote the interval estimation of the N-year return wave height.

Figure 13. The plotting of the return period and $d_2$.

CONCLUSIONS
The interval estimation of the return significant wave height can be regarded as the problem of solving the confidence intervals of the corresponding quantiles of the MED. The parametric methods of interval estimation (WM, MLM, and SQAM) generally behave much better than the nonparametric methods because of the higher frequency with which the theoretical quantiles fall into the intervals, but the widths of the confidence intervals solved by WM and MLM are larger than those solved by SQAM. As the sample size becomes larger (generally larger than 100), the nonparametric methods such as OSM and STM also behave very well in terms of the frequency and the length of the confidence intervals.

The actual results from the data from Weizhou Island show that MLM gives the best interval estimation of the return wave heights. When the return period is larger than 100 years, the estimated return wave heights that are solved by OSM and STM would not fall into the confidence intervals, but if the return period becomes smaller, the methods OSM and STM would behave much better. The parametric methods WM, MLM, and SQAM always behave well whether the sample size is larger or smaller.
ACKNOWLEDGMENTS The study was partially supported by the National Natural Science Foundation of China (50879085), the National Program on Key Basic Research Project (2011CB013704), and the Program for New Century Excellent Talents in University (NCET-07-0778).
LITERATURE CITED
Avramidis, A.N. and Wilson, J.R., 1998. Correlation-induction techniques for estimating quantiles in simulation experiments. Operations Research, 46, 574–591.
Bâ, K.M.; Díaz-Delgado, C., and Cârsteanu, A., 2001. Confidence intervals of quantiles in hydrology computed by an analytical method. Natural Hazards, 24, 1–12.
Borgman, L.E., 1963. Risk criteria. Journal of the Waterways and Harbors Division, 89(3), 1–35.
Chow, V.T., 1964. Handbook of Applied Hydrology. New York: McGraw-Hill, 1468p.
Coles, S., 2007. An Introduction to Statistical Modeling of Extreme Values. London: Springer-Verlag, 224p.
Cramer, H., 1999. Mathematical Methods of Statistics. Princeton, New Jersey: Princeton University Press, 575p.
Cunnane, C., 1989. Statistical Distributions for Flood Frequency Analysis. World Meteorological Organization Operational Hydrology Report 33, WMO No. 718. Geneva, Switzerland: World Meteorological Organization.
David, H.A., 1981. Order Statistics, 2nd edition. New York: Wiley, 384p.
Dong, S.; Liu, W.; Zhang, L.Z., and Guedes Soares, C., 2009. Long-term statistical analysis of typhoon wave heights with Poisson-maximum entropy distribution. In: Proceedings of 28th International Conference on Ocean, Offshore and Polar Engineering (Hawaii, USA, OMAE 2009) OMAE–79278, Volume 2, pp. 189–196.
Dong, S.; Tao, S.S.; Lei, S.H., and Guedes Soares, C., 2012. Parameter estimation of the maximum entropy distribution of significant wave height. Journal of Coastal Research, 29(3), 597–604.
Dong, S.; Xu, P.J., and Liu, W., 2009. Long-term prediction of return extreme storm surge elevation in Jiaozhou Bay. Periodical of Ocean University of China, 39(5), 1119–1124.
Ferreira, J.A. and Guedes Soares, C., 1998. An application of the peaks over threshold method to predict extremes of significant wave height. Journal of Offshore Mechanics and Arctic Engineering, 120(3), 165–176.
García, M.M.; Cebrián, A.A., and Rodríguez, E.A., 1998. Quantile interval estimation in finite population using a multivariate ratio estimator. Metrika, 47, 203–213.
Guedes Soares, C., 1989. Bayesian prediction of design wave heights. In: Thoft-Christensen, P. (ed.), Reliability and Optimization of Structural Systems. Berlin: Springer-Verlag, pp. 311–323.
Guedes Soares, C. and Scotto, M.G., 2004. Application of the r largest-order statistics for long-term predictions of significant wave height. Coastal Engineering, 51, 387–394.
Jaynes, E.T., 1968. Prior probability. IEEE Transactions on Systems Science and Cybernetics, 4, 227–241.
Kite, G.W., 1977. Frequency and Risk Analyses in Hydrology. Reston, Virginia: Water Resources, 257p.
Rao, A.R. and Hamed, K.H., 2000. Flood Frequency Analysis. Boca Raton, Florida: Science Press, 376p.
Sun, S.Z., 2000. Lecture of Nonparametric Statistical Analysis. Beijing: Peking University Press.
Tao, S.S.; Dong, S.; Lei, S.H., and Guedes Soares, C., 2011. Interval estimation of return wave height for marine structural design. In: The Proceedings of 30th International Conference on Offshore Mechanics and Polar Engineering (Rotterdam, the Netherlands, OMAE2011), OMAE49421, Volume 2, pp. 305–311.
van Zwet, W.R., 1964. Convex Transformations of Random Variables. Mathematical Centre Tracts Volume 7. Amsterdam, the Netherlands: Mathematisch Centrum, 116p.
Woodruff, R.S., 1952. Confidence intervals for medians and other position measures. Journal of the American Statistical Association, 47(260), 635–646.
Xia, H.Y. and Li, S.H., 2001. An analysis on annual extreme wave height distribution along Guangxi coast. Journal of Tropical Oceanography, 20(2), 1–7.
Zhang, L.Z. and Xu, D.L., 2005. A new maximum entropy probability function for the surface elevation of nonlinear sea waves. China Ocean Engineering, 19(4), 637–646.