Feb 20, 2013 ... Ansgar Steland. Ewaryst Rafaj lowic. The work is supported by the German
Federal Ministry of the Environment, Nature and Nuclear Safety, ...
Estimation of the quantile function using Bernstein-Durrmeyer polynomials
Andrey Pepelyshev Ansgar Steland Ewaryst Rafajlowic
The work is supported by the German Federal Ministry of the Environment, Nature and Nuclear Safety, grant 0325226.
Hamburg February 20, 2013
1/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Contents
Contents
Bernstein-Durrmeyer operator Properties of the BD estimator of the distribution function Properties of the BD estimator of the quantile function The adaptive choice of N Application in photovoltaics
2/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
Bernstein-Durrmeyer operator
The BD operator DN (u(x)) has the kernel form 1
Z DN (u(x)) =
KN (x, z)u(z)dz 0
where KN (x, z) = (N + 1)
PN
i=0
(N )
Bi
DN (u(x)) = (N + 1)
(N )
(x)Bi
N X
(N )
ai Bi
(z), or
(x)
i=0
whereZ ai = (N )
Bi 3/23
1
(N )
u(x)Bi
(x)dx,
0
(x) =
N! xi (1 i!(N −i)!
− x)N −i .
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
BD operator for deterministic functions
Derriennic (1981) shown that DN (u) converges uniformly,
sup |DN (u(x)) − u(x)| ≤ 2ωu (N −1/2 ), x∈[0,1]
where ωu denotes the modulus of continuity of u. Chen and Ditzian (1991) established the fact that
ku(x) − DN (u(x))kp = O(N −δ/2 ), if and only if
(u) =
inf
a0 ,...,aN
n o ku(x) − a0 − a1 x − · · · − aN xN kp = O(N −δ ),
as N → ∞ for some δ ∈ (0, 2) and p ∈ {2, ∞}, where 1/p R kf kp = I |f (x)|p dx . 4/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
BD operator for the distribution function
Let Fm (x) be the empirical d.f. for a sample X1 , . . . , Xm ∼ F[0,1] with density f and m
1 X fem,N (x) = KN (Xj , x), Fem,N (x) := m j=1
Z
x
fem,N (t)dt 0
Theorem. (Ciesielski, 1988) If F ∈ C[0, 1], then
Pr(kFem,N (x) − F k∞ → 0 as m, N → ∞) = 1. If f ∈ Lp [0, 1] and m = bN β c for some β ∈ (0, 0.5), then
Pr(kfem,N (x) − f kp = o(1) as N → ∞) = 1. 5/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
BD operator for the quantile function
Let Qm (x) be the empirical q.f. for a sample X1 , . . . , Xm . Version 1: N X (N ) em,N (x) := DN (Qm (x)) = (N + 1) Q e ai Bi (x), i=0
Z e ai =
1
(N )
Qm (x)Bi
(x)dx,
i = 1, . . . , m,
0
Version 2:
bm,N (x) = (N + 1) Q
N X
(N )
b ai Bi
(x),
i=0
b ai = Note that 6/23
e ai
1 m
m X
(N )
X(j) Bi
j=1
are asymptotically equivalent to A. Pepelyshev, A. Steland
(
j−1 ) m−1
e ai
in mean square (Yang, 1985).
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
Properties of the BD estimator Lemma. For e ai it is hold
Var(e ai ) ≤ Proof. Set
1 max Var(X(j) ) (N + 1)2 j=1,...,m R j/m
w ej =
(N ) B (t) dt (j−1)/m i , Pm R l/m (N ) (t) dt l=1 (l−1)/m Bi
j = 1, . . . , m. Then we have that Var(e ai ) =
m Z X l=1
l/m
!2 (N ) Bi (t) dt
(l−1)/m
E
m X
(X(j) −EX(j) )2 w ej .
j=1
Apply the Jensen inequality (Eζ (aζ − Eaζ ))2 ≤ Eζ (aζ − Eaζ )2 , where ζ is a random variable such that P(ζ = j) = wj . 7/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
Properties of the BD estimator Lemma. For the BD estimator it is hold
em,N (x)) ≤ Cm := max Var(X(j) ) Var(Q j=1,...,m
and
Z
1
em,N (x))dx ≤ Cm . Var(Q 0
If the derivative of Q(x) is continuous, then Cm = O(1/m). Proof. If X1 , . . . , Xm ∼ U (0, 1) then X(j) ∼ β(j, m − j + 1). Note that the variance of the beta distribution β(a, b) is ab . Therefore, it is easy to see that Cm = O(1/m). (a+b)2 (a+b+1) 8/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Bernstein-Durrmeyer operator
Maximum variance of order statistics
Since Q0 (x) is continuous, then we can write X(j) = Q(U(j) ). Applying the LagrangeTaylor formula for Q(U(j) ) at the point EU(j) = j/(m + 1) = pj , we obtain
X(j) = Q(pj ) + (U(j) − pj )Q0 (ζj ) for some ζj ∈ [min{U(j) , pj }, max{U(j) , pj }]. Then we have
Var(X(j) ) = Var((U(j) − pj )Q0 (ζj )) ≤ max Q02 (x)E(U(j) − pj )2 x
pj (1 − pj ) = max Q02 (x). x m+2
9/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Consistency
MSE and MISE consistency
Let X1 , . . . , Xm be m random variables with common quantile function Q(x) with a continuous derivative on [a, b] ⊂ [0, 1] that satises the Chen-Ditzian condition with p = ∞. Then the Bernstein-Durrmeyer estimator with integral weights, em,N (x), satises Q
em,N (x) − Q(x))2 = O(1/m + N −δ ) max E(Q
x∈[a,b]
and
Z E
b
em,N (x) − Q(x))2 dx = O(1/m + N −δ ) (Q
a
as m → ∞ and N → ∞. 10/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Consistency
The BD estimator with correction term
em,N (x) . Let eel (x) = Dl Qm (x) − Q Dene the error-corrected Bernstein-Durrmeyer em,N,l (x) = Q em,N (x) + eel (x). Q
estimator
Theorem. Suppose X1 , X2 , . . . are i.i.d. random variables
with common quantile function Q(x) and distribution function R F (x) satisfying |x|p dF (x) < ∞ for some 1 ≤ p ≤ ∞. Then em,N,l (x) is consistent in the pth the error-corrected estimator Q mean for the true quantile function Q(x), em,N,l − Qkp → 0, kQ as m → ∞ and N, l → ∞, with probability 1, and in the mean, i.e. em,N,l − Qkp → 0, EkQ as m → ∞ and N, l → ∞. 11/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Adaptive choice of N
Adaptive choice of
N
We modify an algorithm from (Golyandina, Pepelyshev, Steland, 2012) whose idea is based on a balance of the closeness of the BD estimator to the empirical q.f. in terms of the law of iterated logarithms and the stability of the number of modes of the corresponding density estimator. (N )
Note that the Bernstein polynomial Bi (x) can be approximated by the density p of the normal distribution φ((i/N − x)/s), where s = x(1 − x)/N and 2 φ(x) =√(2π)−0.5 e−x /2 . Thus, we can introduce the quantity h = 1/ N which plays the role of the bandwidth. 12/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Adaptive choice of N
Algorithm of adaptive choice of
N
1] Compute n ¯h = max h ∈ (0, 1] : o bm,d1/t2 e (q)) − q ≤ 1/Rm ∀ t ∈ (0, h) max Fm (Q ∞ q
where Fm (x) is the empirical distribution function, dze is the smallest integer that is larger or equal to z , and √ p Rm = 2 m/ 2 log log m.
13/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Adaptive choice of N
Algorithm of adaptive choice of
N
¯ such that 2] Dene a set {h1 , . . . , hn } ∈ (0, h) maxh∈(0,h) ¯ minj=1,...,n |h − hj | is small and compute the sequence M1 , . . . , Mn , where Mj is the number of local bm,N (x) minimums of the Bernstein-Durrmeyer estimator Q with N = d1/h2j e. ˇ j = min{M1 , M2 , . . . , Mj }, j = 1, . . . , n. 3] Compute M
14/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Adaptive choice of N
Algorithm of adaptive choice of
N
4] Divide the set {h1 , . . . , hn } into groups as follows. Dene ai and bi such that a1 ≤ b1 < a2 ≤ b2 < . . . < ak ≤ bk and ˇi = M ˇ j for all hi , hj ∈ [al , bl ] for l ∈ {1, . . . , k}. M
P 5] Compute b h = ki=1 ai wi , where P b = d1/b wi = (bi − ai )/ kj=1 (bj − aj ), and then set N h2 e.
15/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Adaptive choice of N
Consistency of the BD estimator with
Nˆ
Let X1 , . . . , Xm be m random variables with √the continuously 2 m dierentiable quantile function Q, Rm = √2 log . log m Then the adaptive Bernstein-Durrmeyer estimator b b (q) = D b (Qm (q)) where N b is selected by the above Q m,N N algorithm is consistent as m → ∞. Moreover, we have the a.s. uniform error bound p √ b−1 (x) − F (x)| ≤ 2 log log m/ m sup |Q b m,N −∞ c) is the probability of the lot acceptance for given p, the true proportion of non-conforming items.
19/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Application in photovoltaics
Requirements for the sampling plan
The sampling plan is a minimal solution satisfying the following conditions ( OCn,c (p) ≥ 1 − α for p ≤ AQL, OCn,c (p) ≤ β for p ≥ RQL
20/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Application in photovoltaics
Sampling plans for quality control
If measurements are distributed according to a distribution G(x) with mean a and variance σ 2 , then √ OCn,c (p) ∼ = 1 − Φ c + nG−1 (1 − p) . The asymptotically optimal sampling plan (n, c) is & 2 ' Φ−1 (α) − Φ−1 (1 − β) n = 2 , F −1 (AQL) − F −1 (RQL) √ n −1 F (AQL) + F −1 (RQL) , c = − 2 where F (x) = G((x − a)/σ) and Φ(x) is the standard normal distribution. 21/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation
Application in photovoltaics
Numerical results
Characteristics of estimation of the sampling plan (n, c) for two quantile estimators, α = β = 5%, AQL = 2% and RQL = 5%.
m 200 400 800 1600
22/23
Bernstein-Durrmeyer estimator n c mean std.dev. mean std.dev. 57.9 24.8 13.3 2.3 57.9 19.9 13.5 1.9 55.7 14.0 13.4 1.5 54.3 9.9 13.4 1.0
A. Pepelyshev, A. Steland
Empirical quantile estimator n c mean std.dev. mean std.dev. 84.0 178.3 14.7 7.7 59.3 40.0 13.5 3.8 54.1 23.4 13.2 2.6 53.9 18.2 13.3 1.9
Bernstein-Durrmeyer quatile estimation
Application in photovoltaics
Thank you for your attention!
23/23
A. Pepelyshev, A. Steland
Bernstein-Durrmeyer quatile estimation