$$\begin{cases} \sigma^2, & \text{if } p > 2k + 0.75, \\ C^{4(4k+1)/(4p-1)}\,\sigma^{4 - 4(4k+1)/(4p-1)}, & \text{if } k + 0.5 < p \le 2k + 0.75. \end{cases}$$
Moreover, the estimators achieve the optimal rates given by

$$\begin{cases} O(\sigma^2), & \text{if } p > 2k + 0.75, \\ O\big(\sigma^{4 - 4(4k+1)/(4p-1)}\big), & \text{if } k + 0.5 < p \le 2k + 0.75. \end{cases} \qquad (3.18)$$
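As a numerical aside (a sketch in our own notation; the function `rate_exponent` is not from the paper), the exponent in (3.18) joins the parametric value 2 continuously at the elbow p = 2k + 0.75 and degenerates to 0 as p decreases to k + 0.5, below which no polynomial rate is available:

```python
def rate_exponent(p: float, k: float) -> float:
    """Exponent e such that the optimal MSE rate in (3.18) is O(sigma**e)."""
    if p > 2 * k + 0.75:
        return 2.0  # parametric-type rate O(sigma^2)
    if p > k + 0.5:
        return 4.0 - 4.0 * (4 * k + 1) / (4 * p - 1)
    raise ValueError("no polynomial rate for p <= k + 0.5")

# continuity at the elbow p = 2k + 0.75 (here k = 1, so p = 2.75):
assert abs(rate_exponent(2.75, 1.0) - 2.0) < 1e-12
# the exponent tends to 0 (no consistency) as p decreases to k + 0.5:
assert rate_exponent(1.5 + 1e-9, 1.0) < 1e-6
```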
Example 3. (Inverse Problem) Suppose that we are interested in recovering the indirectly observed function $f(t)$ from data of the form

$$Y(u) = \int_0^u \int_0^1 K(t, s)\, f(t)\,dt\,ds + \sigma \int_0^u dW(s), \qquad u \in [0, 1],$$

where $W$ is a Wiener process and $K$ is known. Let $K: L^2[0, 1] \to L^2[0, 1]$ have a singular system decomposition (Bertero et al. (1982)): $\int_0^1 K(t, u) f(t)\,dt = \sum_j \lambda_j (f, \xi_j)\, \eta_j(u)$, where the $\{\lambda_j\}$ are singular values, and the $\{\xi_j\}$ and $\{\eta_j\}$ are orthonormal sets in $L^2[0, 1]$.
Then the observations are equivalent to

$$y_j = \lambda_j a_j + \sigma \varepsilon_j, \qquad j = 1, 2, \ldots,$$

where $y_j = \int_0^1 \eta_j(u)\,dY(u)$ is the Fourier–Bessel coefficient of the observed data, $a_j = (f, \xi_j)$, and $\varepsilon_j = \int_0^1 \eta_j(u)\,dW(u)$ are i.i.d. $N(0, 1)$, $j = 1, 2, \ldots$. Now suppose that we want to estimate

$$T(f) = \int_0^1 f^2(t)\,dt = \sum_{j=1}^\infty a_j^2 = \sum_{j=1}^\infty \lambda_j^{-2} x_j^2, \qquad \text{where } x_j = \lambda_j a_j.$$
If the non-parametric constraint on $a$ is a hyperrectangle, then the constraint on $x$ is also a hyperrectangle in $R^\infty$. Applying Theorem 2, we can get an optimal estimator in terms of rate of convergence. Moreover, we will know roughly how efficient the estimator is if we apply Tables 7.1 and 7.2. If, instead, the constraint on $a$ is a weighted $l_p$-body defined by (1.4), then the constraint on $x$ is also a weighted $l_p$-body, and we can apply the results in Section 4 to get the best possible estimator in terms of rate of convergence.
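To make the reduction concrete, here is a small simulation sketch (the singular values $\lambda_j = 1/j$, the coefficients $a_j$, the truncation point, and all names are illustrative choices of ours, not quantities from the paper): the transformed coordinates $x_j = \lambda_j a_j$ follow the sequence model $y_j = x_j + \sigma\varepsilon_j$, and $T(f) = \sum \lambda_j^{-2} x_j^2$ is estimated by the truncated quadratic estimator $\sum_{j \le n} \lambda_j^{-2}(y_j^2 - \sigma^2)$.

```python
import random

lam = [1.0 / j for j in range(1, 51)]        # illustrative singular values
a = [j ** -2.0 for j in range(1, 51)]        # illustrative coefficients a_j = (f, xi_j)
x = [l * c for l, c in zip(lam, a)]          # x_j = lambda_j * a_j

sigma = 1e-3
rng = random.Random(0)
y = [xj + sigma * rng.gauss(0.0, 1.0) for xj in x]   # y_j = x_j + sigma * eps_j

n = 20                                        # truncation point
Q_hat = sum(lam[j] ** -2 * (y[j] ** 2 - sigma ** 2) for j in range(n))
Q_true = sum(c ** 2 for c in a)               # sum a_j^2 = sum lambda_j^{-2} x_j^2

assert abs(Q_hat - Q_true) < 0.05             # small truncation bias + noise
```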
4. Extension to quadratically convex sets

In this section, we will find the optimal rate of convergence for estimating the quadratic functional

$$Q(x) = \sum_{j=1}^\infty \lambda_j x_j^2 \qquad (4.1)$$

under the geometric constraint of a weighted $l_p$-body

$$\Sigma_p = \Big\{x: \sum_j a_j\,|x_j|^p \le C^p\Big\} \qquad (a_j > 0). \qquad (4.2)$$
achieves the optimal rate of convergence given by

$$\begin{cases} O(\sigma^2), & \text{if } \alpha \ge 2k + 0.25, \\ O\big(\sigma^{16(\alpha - k)/(4\alpha + 1)}\big), & \text{if } 2k + 0.25 > \alpha > k. \end{cases} \qquad (4.7)$$
Now, let us give the optimal rate for the weighted $l_p$-body constraints ($p > 2$).

Assumption D.
a) $\limsup_{n\to\infty} n^{(3p-4)/(2p)}\,\lambda_n a_n^{-2/p} = \liminf_{n\to\infty} n^{(3p-4)/(2p)}\,\lambda_n a_n^{-2/p}$;
b) $\sum_{j=n+1}^\infty \lambda_j^{p/(p-2)} a_j^{-2/(p-2)} = O\big(n\,\lambda_n^{p/(p-2)} a_n^{-2/(p-2)}\big)$;
c) $\sum_{j=1}^n \lambda_j^{p/(p-2)} a_j^{-2/(p-2)} = O\big(n^{p/(2(p-2))}\,\lambda_n^{p/(p-2)} a_n^{-2/(p-2)}\big)$, if $\limsup_{n\to\infty} n^{(3p-4)/(2p)}\,\lambda_n a_n^{-2/p} = \infty$;
d) $a_n^{2/p}\, n^{4/p - 1}$ increases to infinity as $n \to \infty$.
Theorem 5. Under Assumptions B and D, the optimal rate of convergence for estimating $Q(x)$ under the weighted $l_p$-body constraint (4.2) depends on whether $\limsup_{n\to\infty} n^{(3p-4)/(2p)}\,\lambda_n a_n^{-2/p}$ is finite. In particular, when $\lambda_j = j^q$ and $a_j = j^r$ ($p > 2$), the optimal rate is

$$\begin{cases} O(\sigma^2), & \text{if } r \ge 0.75p - 1 + pq, \\ O\big(\sigma^{8(2(r+1) - p(q+1))/(4(r+1) - p)}\big), & \text{if } p(q+1)/2 - 1 < r < 0.75p - 1 + pq. \end{cases} \qquad (4.10)$$

Moreover, the truncated estimator $\sum_{j=1}^{n_\sigma} j^q (y_j^2 - \sigma^2)$ achieves the optimal rate of convergence, where $n_\sigma = [(d/\sigma)^{4p/(4r+4-p)}]$ and $d$ is a positive constant.
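A minimal sketch of the truncated estimator in Theorem 5 (names and the default constant d are ours; the truncation rule follows our reading of the formula for $n_\sigma$ above):

```python
def truncated_estimator(y, sigma, q, n_trunc):
    """Q_hat = sum_{j <= n_trunc} j^q * (y_j^2 - sigma^2)."""
    return sum((j + 1) ** q * (yj ** 2 - sigma ** 2)
               for j, yj in enumerate(y[:n_trunc]))

def n_sigma(sigma, p, r, d=1.0):
    # n_sigma = [(d / sigma)^{4p / (4r + 4 - p)}], our reading of the text
    return max(1, int((d / sigma) ** (4 * p / (4 * r + 4 - p))))

# noiseless sanity check: with sigma = 0 the estimator returns the truncated functional
x = [j ** -1.5 for j in range(1, 101)]
est = truncated_estimator(x, 0.0, 1, 50)
exact = sum(j ** 1 * x[j - 1] ** 2 for j in range(1, 51))
assert abs(est - exact) < 1e-12
# the truncation level grows as sigma -> 0
assert n_sigma(1e-2, 2.0, 1.0) > n_sigma(1e-1, 2.0, 1.0)
```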
Remark 1. Geometrically, the weighted $l_p$-body is quadratically convex (convex in $x_j^2$) when $p \ge 2$, and is convex when $1 \le p < 2$.

(Note that $n$ depends on $\sigma$ in the current setting.) Comparing Lemma 1′ and Lemma 1 of the present paper, we find that testing an $n$-dimensional sphere with the uniform prior against the origin is as difficult as testing the vertices (with uniform prior) of the largest inner hypercube of the sphere (Figure 1), i.e. the sum of type I and type II errors of the best testing procedures for both testing problems is asymptotically the same (as $r_n = \sqrt{n}\, l_n$). For simplicity of notation, assume without loss of generality that the largest $n$-dimensional hypersphere is attained at
and assume that we want to estimate a functional $T(x)$ (not necessarily a symmetric functional) based on observations (1.1) with $T(0) = 0$. Let $C_n$ be the vertices of the largest inner hypercube of $S_n$. Then, by Lemma 1 and Lemma 1′, choosing the dimension $n_\sigma$ (the smallest one) such that (5.2) holds, the sum of type I and type II errors of the testing problems (5.1) and (2.3) (with $l_n = r_n/\sqrt{n}$) is no smaller than $\Phi(-d/\sqrt{8})$. Let $R_n^s = \inf_{x \in S_n} |T(x)|/2$ be the half difference between the functional values on $S_n$ and the functional value at the origin. Similarly, let $R_n^c = \inf_{x \in C_n} |T(x)|/2$. Then, according to the proof of Theorem 1, the lower bound of Ibragimov et al. (1987) is that for any estimator $\delta(y)$ based on the observations (1.1), (5.3)
and in terms of MSE, (5.4). Our hypercubical lower bound is exactly the same as (5.3) and (5.4) except replacing $R_{n_\sigma}^s$ by $R_{n_\sigma}^c$. Note that $R_{n_\sigma}^c \ge R_{n_\sigma}^s$. The claim follows.
Note that we did not use the fact that hypercubes are easier to inscribe into geometric regions than hyperspheres in the above argument; rather, we used the fact that the difference of the functional values on the sphere from the value at the origin is no larger than the difference of the functional values on the vertices of the hypercube from the value at the origin, i.e. $R_n^s \le R_n^c$.
For some geometric constraints $\Sigma$ (e.g. a hyperrectangle), even though the functional may be spherically symmetric (e.g. $T(x) = \sum x_j^2$; the differences of functional values $R_n^s$ and $R_n^c$ defined above are the same), one can inscribe an $n$-dimensional inner hypercube with inner length $l_n(\Sigma)$, but it is impossible to inscribe an $n$-dimensional inner sphere with radius $\sqrt{n}\, l_n(\Sigma)$ (Figure 2). Hence, the hypersphere method cannot give an attainable lower bound in these cases (see Remark 3), even though the functional may be spherically symmetric.
Figure 2. A hypercube is easier to inscribe into a hyperrectangle than a hypersphere: it is impossible to inscribe a hypersphere with radius $r_n = \sqrt{n}\, l_n$ inside the hyperrectangle.
In summary, the hypercube method is strictly better than the hypersphere approach. Compared with the hypersphere method, the hypercube method has the following advantages: i) a larger difference of functional values ($R_n^c \ge R_n^s$; broader application in terms of estimating functionals); ii) easier inscription ($l_n(\Sigma) \ge r_n(\Sigma)/\sqrt{n}$; broader application in terms of constraints). Another advantage of our method is that it is easy to inscribe a hypercube into a nonsymmetric constraint, as we only require the vertices of the hypercube to lie in the constraint instead of the entire hypercube (see Example 5 below).
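Point ii) can be illustrated with the hyperrectangle $\{|x_j| \le j^{-p}\}$ (our example choice; function names ours): both the inner length and the inner radius equal $n^{-p}$, so the hypercube's vertices reach a factor $\sqrt{n}$ farther from the origin than any inscribable sphere:

```python
import math

def inner_radius(n, p):
    # largest r with the n-dim ball {sum x_j^2 <= r^2} inside {|x_j| <= j^-p}
    return min(j ** -p for j in range(1, n + 1))

def inner_length(n, p):
    # largest l with every vertex (+-l, ..., +-l) inside {|x_j| <= j^-p}
    return min(j ** -p for j in range(1, n + 1))

n, p = 100, 1.5
r_n, l_n = inner_radius(n, p), inner_length(n, p)
assert l_n >= r_n / math.sqrt(n)        # l_n(Sigma) >= r_n(Sigma) / sqrt(n)
assert l_n * math.sqrt(n) > r_n         # vertices lie sqrt(n) times farther out
```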
6. Further Remarks

Remark 3. It appears that the hypersphere bound of Ibragimov et al. (1987) would not give a lower bound of the same order as we are able to get via hypercubes for estimating the quadratic functional $Q(x) = \sum_{j=1}^\infty j^q x_j^2$ with the hyperrectangular constraint $\Sigma = \{x \in R^\infty: |x_j| \le C j^{-p}\}$. The proof is simple. Assume that $C = 1$. The $n$-dimensional inner radius of the hyperrectangle is $r_n(\Sigma) = n^{-p}$.
By Lemma 3.1 of Ibragimov et al. (1987) (see Lemma 1′), the smallest dimension at which we cannot consistently test the origin against the sphere with the uniform prior is $n_\sigma^s = [(\sqrt{d}\,\sigma)^{-4/(4p+1)}]$ (see (5.2)), which is of smaller order than $n_\sigma^c = [(\sqrt{d}\,\sigma)^{-4/(4p-1)}]$, the dimension at which we cannot test the vertices of the hypercube against the origin perfectly, where $d$ is a positive constant. By Corollary 6 of Ibragimov et al. (1987), the lower bound under the quadratic loss is of order $(R_{n_\sigma}^s)^2 = O\big(\sigma^{4 - 4/(4p+1)}\big)$, which is of smaller order than $(R_{n_\sigma}^c)^2$ given by (3.17). It is clear that in the current setting a hypercube is easier to inscribe into a hyperrectangle than a hypersphere, and hence the hypersphere method cannot give an attainable lower rate, while we can get the attainable lower bound via the hypercube method.
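The order comparison in this remark can be spot-checked numerically (function names ours; the hypercube exponent $4 - 4(2q+1)/(4p-1)$ is the attainable rate for this functional, as in Section 7c). A larger $\sigma$-exponent means a smaller quantity as $\sigma \to 0$, so the hypersphere bound is strictly weaker:

```python
def sphere_exp(p):
    # hypersphere lower bound is of order sigma ** sphere_exp(p)
    return 4 - 4 / (4 * p + 1)

def cube_exp(p, q):
    # hypercube (attainable) lower bound exponent for Q(x) = sum j^q x_j^2
    return 4 - 4 * (2 * q + 1) / (4 * p - 1)

for q in (0, 1, 2):
    p = q + 0.7   # a value in the nonregular range (q+1)/2 < p <= q + 0.75
    assert sphere_exp(p) > cube_exp(p, q)
```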
Remark 4. Using the method of Ibragimov et al. (1987) to develop the lower bound for the weighted $l_p$-body ($\lambda_j = j^q$, $a_j = j^r$), we find that the lower bound is of order $O\big(\sigma^{8(2r + 2 - p)/(4r + 4 - p)}\big)$, which is not an attainable rate when $q \ne 0$. Thus, the hypersphere method does not work in the current setting. The reason for this is that the value of the functional $Q(x) = \sum_{j=1}^\infty j^q x_j^2$ changes as $x$ moves on an $n$-dimensional sphere (note that when $q = 0$, the hypersphere method can also give an attainable lower rate, as $Q(x)$ remains the same when $x$ is on the sphere). Note that in the current setting, the largest $n$-dimensional inner hypercube lies in the largest $n$-dimensional hypersphere (see Figure 1). Thus, the reason our method gives a larger lower bound is not that the hypercube is easier to inscribe than the hypersphere, but that, for estimating a symmetric functional ((2.1)), the value of the functional remains the same when $x$ is on the vertices of the hypercube.

7.
Discussions

a) Possible Applications

We have demonstrated that for special kinds of constraints, the hyperrectangles and the weighted $l_p$-bodies ($p \ge 2$), the difficulties of estimating quadratic functionals are captured by the difficulties of the hypercube subproblems. The intuition behind these results can be explained as follows. For hypercube-type hyperrectangles (i.e. the lengths of a hyperrectangle satisfy Assumption A), the difficulties of estimating the quadratic functionals are captured by the difficulties of the cubical subproblems (Theorem 2). Now, for estimating quadratic functionals under the constraints of weighted $l_p$-bodies (Theorems 4 and 5), the difficulties are actually captured by rectangular subproblems, which happen to be of hypercube type. Thus, the difficulties of estimating quadratic functionals under the weighted $l_p$-body constraints (Theorems 4 and 5) are also captured by the cubical
subproblems. A more general phenomenon might be true: the difficulties of estimating quadratic functionals under quadratically convex (convex in $x_j^2$) constraints $\Sigma$ (see Remark 1) are captured by the hardest hyperrectangular subproblems:

$$\min_{q(y)} \max_{x \in \Sigma} E_x\big(q(y) - Q(x)\big)^2 \le C \max\Big\{\min_{q(y)} \max_{x \in \Theta(\tau)} E_x\big(q(y) - Q(x)\big)^2 : \Theta(\tau) \subset \Sigma\Big\},$$

where $\Theta(\tau)$ is a hyperrectangle with coordinates $\tau$, $q(y)$ is an estimator based on our model (1.1), and $C$ is a finite constant. For estimating linear functionals, the phenomenon above is true (Donoho et al. (1988)).
We can apply our hypercube bound to a non-symmetric constraint, and also to an unsymmetric functional. Let us give an example involving the use of our theory.

Example 5. (Estimation of Fisher Information) Suppose that we want to estimate the Fisher information

$$I(f) = \int \frac{[f'(x)]^2}{f(x)}\,dx$$

based on the data observed from (1.6), under the nonparametric constraint that

$$f \in \Sigma^* = \Sigma \cap \Big\{f: \int_0^1 f(x)\,dx = 1,\ f(x) \ge 0\Big\},$$

where $\Sigma$ is a subset of $R^\infty$. Take the same orthogonal set as in Example 2. Let $l_n(\Sigma)$ be the $n$-dimensional inner length of $\Sigma$ (not $\Sigma^*$). Inscribe the largest $n$-dimensional inner hypercube into $\Sigma$. The functions whose Fourier coefficients are on the vertices of the hypercube are the set of functions

$$f_n(x) = 1 + l_n(\Sigma) \sum_{j=2}^n \pm\,\phi_j(x)$$

for all possible choices of signs $\pm$, where the $\phi_j(t)$ are the sinusoid functions given in Example 2. If $n\, l_n(\Sigma) \to 0$, all of these functions are positive and hence lie in the set of our positivity constraints. By our hypercubical approach, for any estimator $T(Y)$ based on the observations (1.6) (see the proof of Theorem 1),
$$\sup_f E_f\big(T(Y) - I(f)\big)^2 \ge \Phi\Big(-\frac{d}{\sqrt{8}}\Big)\Big[\frac{1}{2}\int_0^1 \big(f_{n_\sigma}'(x)\big)^2\,dx\Big]^2 = \frac{\Phi(-d/\sqrt{8})}{4}\,\big[c_2 \pi^2\, l_{n_\sigma}^2(\Sigma) + o(1)\big]^2\, n_\sigma^6\,(1 + o(1)),$$
where $n_\sigma$ is the smallest integer satisfying (2.5). Thus, if the nonparametric constraint $\Sigma$ is defined as in Example 4, then the minimax lower bound is at least as large as

$$\begin{cases} O(\sigma^2), & \text{if } \alpha \ge 2.25, \\ O\big(\sigma^{16(\alpha - 1)/(4\alpha + 1)}\big), & \text{if } 2.25 > \alpha > 1, \end{cases}$$

and for the hyperrectangle constraint $\Sigma = \{|x_j(f)| \le C j^{-p}\}$, the minimax lower bound is

$$\begin{cases} O(\sigma^2), & \text{if } p > 2.75, \\ O\big(\sigma^{(16p - 24)/(4p - 1)}\big), & \text{if } 1.5 < p \le 2.75. \end{cases}$$
Thus, the lower bound for estimating the Fisher information is at least as large as that of estimating $\int_0^1 [f'(x)]^2\,dx$, and the lower bound is attainable if $f$ is bounded away from 0. Note that $\Sigma^*$ is an unsymmetric set in this example and $T(f)$ is an unsymmetric functional (see (2.1)). Our hypercube method can thus also be applied to estimating an unsymmetric functional with an unsymmetric positivity constraint. The functional $T(f)$ above can also be replaced by $T(f) = \int_0^1 [f^{(k)}(t)]^2\,dt$.
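The positivity construction in Example 5 is easy to verify numerically (a sketch: we take $\phi_j(t) = \sqrt{2}\cos(2\pi j t)$ as the sinusoids of Example 2, which is our assumption about that example; names are ours):

```python
import math
import random

def f_n(t, signs, l_n):
    # f_n(t) = 1 + l_n * sum_{j=2}^{n} (+-1) * sqrt(2) * cos(2*pi*j*t)
    return 1.0 + l_n * sum(s * math.sqrt(2.0) * math.cos(2.0 * math.pi * j * t)
                           for j, s in enumerate(signs, start=2))

n, l_n = 10, 1e-3                                    # n * l_n is small
rng = random.Random(1)
signs = [rng.choice((-1, 1)) for _ in range(n - 1)]  # signs for j = 2, ..., n

grid = [i / 1000.0 for i in range(1001)]
vals = [f_n(t, signs, l_n) for t in grid]
assert min(vals) > 0                    # positivity: |f_n - 1| <= sqrt(2) * n * l_n < 1
assert abs(sum(vals) / len(vals) - 1.0) < 1e-3   # integral is 1: cosines average to 0
```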
b) Constants

By using the hardest 1-dimensional trick, we can prove the following lower bound (see Fan (1989) for details):

Theorem 6. If $\Sigma$ is a symmetric, convex set containing the origin, then the minimax risk of estimating $Q(x)$ from the observations (1.1) is at least

$$\sup_{x \in \Sigma} \cdots \qquad (7.1)$$

and, as $\sigma \to 0$, for any estimator $\delta(y)$, (7.2), where $\rho(\tau, 1) = \inf_\delta \sup_{|\theta| \le \tau} E_\theta\big(\delta(z) - \theta^2\big)^2$, and $z \sim N(\theta, 1)$.
Comparing the lower bound (7.2) and the upper bound (3.16) for estimating $\sum_{j=1}^\infty j^q x_j^2$ with a hyperrectangular constraint $\{x \in R^\infty: |x_j| \le C j^{-p}\}$, we have that when $p > q + 0.75$,

$$\frac{\text{Lower Bound}}{\text{Upper Bound}} \to \frac{C_{2p-q}^2}{C_{2p}\, C_{2p-2q}} \qquad (\text{as } \sigma \to 0), \qquad (7.3)$$

where $C_r = \sum_{j=1}^\infty j^{-r}$ can be calculated numerically. Table 7.1 shows the results for the right-hand side of (7.3). From Table 7.1, we know that the best truncated estimator is very efficient. Thus, the difficulty of the hardest 1-dimensional subproblem captures the difficulty of the full problem pretty well in the case $p > q + 0.75$.
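The right-hand side of (7.3) is easy to evaluate: $C_r = \sum_{j \ge 1} j^{-r}$ is the Riemann zeta value $\zeta(r)$. The following sketch (the tail-correction trick and the names are ours) reproduces, e.g., the $q = 0.5$, $i = 1$ entry of Table 7.1 (where $p = 2$):

```python
def c_sum(r, terms=100_000):
    """C_r = sum_{j>=1} j^-r, with an integral tail correction (needs r > 1)."""
    partial = sum(j ** -r for j in range(1, terms + 1))
    return partial + terms ** (1.0 - r) / (r - 1.0)   # tail ~ integral_N^inf x^-r dx

def efficiency(q, p):
    """Right-hand side of (7.3): C_{2p-q}^2 / (C_{2p} * C_{2p-2q})."""
    return c_sum(2 * p - q) ** 2 / (c_sum(2 * p) * c_sum(2 * p - 2 * q))

assert abs(efficiency(0.0, 1.5) - 1.0) < 1e-9    # the q = 0 row of Table 7.1 is 1
assert round(efficiency(0.5, 2.0), 3) == 0.976   # Table 7.1: q = 0.5, i = 1 (p = 2)
```

By Cauchy–Schwarz, $C_{2p-q}^2 \le C_{2p} C_{2p-2q}$, so the ratio is at most 1, with equality exactly when $q = 0$, matching the first row of Table 7.1.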
Table 7.1: Comparison of the lower bound and upper bound (p = 1 + q + 0.5 i)

          i=1     i=2     i=3     i=4     i=5     i=6     i=7     i=8
q=0.0   1.000   1.000   1.000   1.000   1.000   1.000   1.000   1.000
q=0.5   0.976   0.991   0.996   0.998   0.999   1.000   1.000   1.000
q=1.0   0.940   0.977   0.990   0.995   0.998   0.999   0.999   1.000
q=1.5   0.910   0.963   0.984   0.992   0.996   0.998   0.999   0.999
q=2.0   0.887   0.952   0.979   0.990   0.995   0.998   0.999   0.999
q=2.5   0.871   0.944   0.975   0.988   0.995   0.997   0.998   0.999
q=3.0   0.859   0.938   0.972   0.987   0.994   0.997   0.998   0.999
q=3.5   0.851   0.934   0.970   0.986   0.993   0.996   0.998   0.999
q=4.0   0.845   0.931   0.968   0.985   0.993   0.996   0.998   0.999
q=4.5   0.842   0.929   0.967   0.984   0.992   0.996   0.998   0.999
q=5.0   0.839   0.928   0.966   0.984   0.992   0.996   0.998   0.999
When $(q + 1)/2 < p \le q + 0.75$, by comparing the lower bound (3.17) and the upper bound (3.15), again we show that the best truncated estimator attains the optimal rate. The following table shows how close the lower bound and the upper bound are.

Table 7.2: Comparison of the lower bound and the upper bound

     q=0            q=1            q=2            q=3            q=4
   p     ratio    p     ratio    p     ratio    p     ratio    p     ratio
  0.55  0.0333   1.15  0.0456   1.75  0.0485   2.35  0.0495   2.95  0.0499
  0.60  0.0652   1.30  0.0836   2.00  0.0864   2.70  0.0864   3.40  0.0858
  0.65  0.0949   1.45  0.1159   2.25  0.1170   3.05  0.1153   3.85  0.1132
  0.70  0.1218   1.60  0.1431   2.50  0.1419   3.40  0.1383   4.30  0.1345

where ratio = (hypercube lower bound) / (upper bound by $q_{UT}(y)$).
The above table tells us that there is a large discrepancy at the level of constants between the upper and lower bounds in the case $(q + 1)/2 < p \le q + 0.75$.
c) Bayesian approach

The following discussion will focus on estimating the quadratic functional $Q(x) = \sum_{j=1}^\infty j^q x_j^2$ with the constraint $x \in \Sigma = \{x: |x_j| \le j^{-p}\}$. It is well known that the minimax risk is attained at the worst Bayes risk. The traditional method of finding a minimax lower bound is to use the Bayesian method with an intuitive prior. However, in the current setting, all intuitive Bayesian methods fail to give an attainable (sharp in rate) lower bound. Thus, finding an attainable rate of estimating a quadratic functional is a non-trivial job. By an intuitive prior, we mean assigning the prior uniformly on the hyperrectangle, which is equivalent to assigning the prior uniformly and independently on each coordinate, or, more generally, assigning the priors $x_j \sim \pi_j(a)$ independently. To see why the Bayesian method with an intuitive prior cannot give an attainable rate, let $\rho_{\pi_j}(j^{-p}, \sigma)$ be the Bayes risk of estimating $a^2$ from $Y \sim N(a, \sigma^2)$ with a prior $\pi_j(a)$ concentrated on $[-j^{-p}, j^{-p}]$. Then,

$$\rho_{\pi_j}(j^{-p}, \sigma) \le \min\big\{j^{-4p},\ 3\sigma^4 + 4\sigma^2 j^{-2p}\big\}, \qquad (7.4)$$
as the Bayes risk is no larger than the maximum risk of the estimators $0$ and $y^2$. Let $\delta(y) = E\big(\sum_{j=1}^\infty j^q x_j^2 \mid y\big)$ be the Bayes solution of the problem. By the independence assumption on the prior, $\delta(y) = \sum_{j=1}^\infty j^q E(x_j^2 \mid y_j)$. Thus, by (7.4) the Bayes risk is
$$E_\pi E_x\Big(\delta(Y) - \sum_{j=1}^\infty j^q x_j^2\Big)^2 = \sum_{j=1}^\infty j^{2q}\,\rho_{\pi_j}(j^{-p}, \sigma) \le \sum_{j=1}^n j^{2q}\big(3\sigma^4 + 4\sigma^2 j^{-2p}\big) + \sum_{j=n+1}^\infty j^{2q-4p} = O\big(\sigma^{4 - (2q+1)/p} + \sigma^2\big) = o\big(\sigma^{4 - 4(2q+1)/(4p-1)}\big),$$

choosing $n \asymp \sigma^{-1/p}$, when $(q + 1)/2 < p \le q + 0.75$. Hence, no intuitive prior can yield an attainable (sharp in rate) lower bound in this case.

Thus, we conclude that the truncated estimator $q_{UT}(y)$ achieves the rate $n_\sigma \sigma^4$.
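The order comparison behind the Bayesian argument is elementary and can be spot-checked (function names ours; the $\sigma^2$ term in the bound is handled by capping the exponent at 2): since $1/p > 4/(4p - 1)$, the intuitive-prior Bayes risk is of strictly smaller order than the minimax rate throughout $(q+1)/2 < p < q + 0.75$:

```python
def bayes_exp(q, p):
    # exponent of the intuitive-prior Bayes risk bound O(sigma^{4-(2q+1)/p} + sigma^2)
    return min(2.0, 4.0 - (2 * q + 1) / p)

def minimax_exp(q, p):
    # exponent of the attainable minimax rate sigma^{4 - 4(2q+1)/(4p-1)}
    return 4.0 - 4.0 * (2 * q + 1) / (4 * p - 1)

for q in (0, 1, 2, 3):
    lo, hi = (q + 1) / 2, q + 0.75
    for t in (0.2, 0.5, 0.8):           # strictly inside the nonregular range
        p = lo + t * (hi - lo)
        # a larger exponent means a smaller quantity as sigma -> 0:
        assert bayes_exp(q, p) > minimax_exp(q, p)
```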
To prove that the rate is the best attainable one, we need to show that no estimator can estimate $Q(x)$ faster than the rate given by (4.4). For the first case, by Theorem 6, we know that $O(\sigma^2)$ is the optimal rate. To prove the second case, let us inscribe an $n$-dimensional inner hypercube. The largest inner hypercube we can inscribe into the weighted $l_2$-body is the hypercube, symmetric about the origin, with length $2A_n$, where $A_n^2 = C^2 / \big(\sum_{j=1}^n a_j\big)$, i.e. the $n$-dimensional inner length is $l_n(\Sigma) = A_n$. Some simple algebra shows that, when $\sigma$ is small, … for some $D > 0$. By Theorem 1, we conclude that for any estimator $T(y)$,

$$\sup_{x \in \Sigma} E_x\big(T(y) - Q(x)\big)^2 \ge \cdots$$

where … $= A_{n_\sigma}^2$ …