IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 58, NO. 11, NOVEMBER 2012
Consistent Nonparametric Regression for Functional Data Under the Stone–Besicovitch Conditions

Liliana Forzani, Ricardo Fraiman, and Pamela Llop
Abstract—In this paper, we address the problem of nonparametric regression estimation in the infinite-dimensional setting. We start by extending Stone's seminal result to the case of metric spaces when the probability measure of the explanatory variables is tight. Then, under slight variations of the hypotheses, we state and prove the theorem for general metric measure spaces. From this result, we derive the mean square consistency of the k-NN and kernel estimators if the regression function is bounded and the Besicovitch condition holds. We also prove that, for the uniform kernel estimate, the Besicovitch condition is also necessary in order to attain consistency for almost every x. Index Terms—Functional data, nonparametric regression, separable metric spaces.
I. INTRODUCTION

In the functional data context, the problem of infinite-dimensional linear regression has been popularized by Ramsay and Silverman [24]. Since then, many authors have worked on this topic. Among them, we refer to [2], [7], [8], [11], [15], [17], [18], and [19]. In the nonparametric framework, there exist several results about consistency of regression estimators. Most of them are for specific estimates (like k-NN and kernel estimators) and in some cases are given only for classification problems. For the kernel estimator, in [14], the authors provide consistency results assuming continuity of the regression function r, boundedness of the moments of the response, and a condition on the distribution of the process (the so-called small ball probability assumption). In addition, if the regression function is Lipschitz, they also obtain rates of convergence of the estimator. When the response is binary (classification), Biau et al. [3] proved the weak consistency of the k-NN classification rule in separable Hilbert spaces. They used a filtering method to reduce the dimension of the space and then applied Stone's consistency theorem in finite dimension (see [26]). On the other hand, for separable metric spaces, Cérou and Guyader [9] proved the weak consistency of that classification rule under some regularity condition on the regression function r with respect to μ, the probability measure of the random element X. This condition is called the Besicovitch condition:

  lim_{δ→0} (1/μ(B(x, δ))) ∫_{B(x, δ)} r dμ = r(x)   (1)

in probability, where B(x, δ) is the closed ball of center x and radius δ. Abraham et al. [1] showed that the moving window classification rule is not consistent in general metric spaces and gave conditions on the space and the regression function to ensure the strong consistency of the estimator. These conditions are the existence of an increasing sequence of totally bounded subsets of the space, a condition on the bandwidth related to the covering numbers of these subsets, and the Besicovitch condition. For general regression functions, recently Biau et al. [4] have presented counterexamples showing that the k-NN estimator is not consistent in general metric spaces. However, they found rates of mean square convergence for Lipschitz regression functions in separable Banach spaces under somewhat restrictive conditions. More precisely, they require that the support S of the underlying distribution be totally bounded, and that the smallest radius such that there exist open balls of this radius covering the set S be finite. As mentioned by the authors, in infinite-dimensional spaces closed balls are not totally bounded, so that most of the time this requirement fails. However, they provide a nice set of examples where the condition holds for some families of distributions, like the case where S is included and bounded in the space of k times differentiable functions with bounded derivatives. As we have seen in the previous references, the Besicovitch condition is at the heart of most consistency proofs for classical nonparametric estimates (like k-NN and kernel based estimates). It clearly combines conditions on the regression function and the underlying distribution of the explanatory variables.

Manuscript received November 12, 2011; revised May 10, 2012 and July 03, 2012; accepted July 12, 2012. Date of publication August 03, 2012; date of current version October 16, 2012. This work was supported by PICT2008-0921, PICT2008-0622, PI 62-309, and PIP 112-200801-0218. L. Forzani and P. Llop are with the Facultad de Ingeniería Química and Instituto de Matemática Aplicada del Litoral, Universidad Nacional del Litoral—Consejo Nacional de Investigaciones Científicas y Técnicas, S3000GLN Santa Fe, Argentina (e-mail: [email protected]; [email protected]). R. Fraiman is with the Departamento de Matemática y Ciencias, Universidad de San Andrés, 1644 Victoria, Argentina, and also with the Centro de Matemática, Universidad de la República, 11300 Montevideo, Uruguay (e-mail: [email protected]). Communicated by N. Cesa-Bianchi, Associate Editor for Pattern Recognition, Statistical Learning, and Inference. Digital Object Identifier 10.1109/TIT.2012.2209628
If r is continuous, then nothing is required of μ, while if r is not continuous some requirement on μ (like a homogeneity condition) is necessary in order for the Besicovitch condition to hold (see [16], [22], and [27]). In finite dimension, it holds automatically for any integrable function r, since it is just the differentiation theorem with respect to a finite measure (see, for instance, [28, p. 189]). Although the result is no longer true in infinite-dimensional spaces, it holds in a general setting if the function r is continuous. Several authors have attempted to generalize this theorem to general metric spaces, obtaining important results (see, for instance, [16], [22], and [27] and the references therein).
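In finite dimension, this differentiation property can be checked numerically. The following sketch (my own illustration; the function, the Gaussian weight, and the Riemann grid are arbitrary choices, not from the paper) approximates averages of a bounded continuous function over shrinking balls and shows them approaching the value at the center:

```python
import numpy as np

# Illustration of the finite-dimensional differentiation theorem: for the
# standard Gaussian weight on R and a bounded continuous f, the averages of f
# over balls B(x0, delta) converge to f(x0) as delta -> 0.  All concrete
# choices (f, x0, the radii, the grid size) are illustrative assumptions.
def ball_average(f, x0, delta, n_grid=20001):
    """Average of f over B(x0, delta) with respect to the N(0, 1) weight."""
    t = np.linspace(x0 - delta, x0 + delta, n_grid)
    w = np.exp(-t ** 2 / 2)          # unnormalized Gaussian density
    return np.sum(f(t) * w) / np.sum(w)

f = lambda t: np.cos(3 * t)          # bounded continuous test function
x0 = 0.7
errors = [abs(ball_average(f, x0, d) - f(x0)) for d in (0.5, 0.1, 0.02)]
# The approximation error shrinks with the radius of the ball.
```

As the radii decrease, the ball averages approach f(x0), in line with the finite-dimensional differentiation theorem quoted above.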
In this paper, we focus on providing consistency results under weak assumptions which are easy to understand. We start by extending Stone's seminal result (see [26]) to the case of locally compact metric spaces. Then we extend the result to the case of metric spaces equipped with a measure which is tight. Next, we state the theorem for the case when the regression function is somewhat regular, and later we present a result that, under slight modifications of the hypotheses, holds for metric measure spaces in general. As a consequence, we derive the consistency of the k-NN and kernel-based estimators when the Besicovitch condition holds and the regression function is bounded. We also prove that, for the uniform kernel estimate, the Besicovitch condition is necessary in order to attain consistency for almost every x.
II. SETTING AND NOTATION

Let (F, d) be a metric space and let (X_1, Y_1), ..., (X_n, Y_n) be independent identically distributed (i.i.d.) random elements with the same law as the pair (X, Y), fulfilling the model

  Y = r(X) + e,   (2)

where the error e satisfies E(e) = 0 and E(e^2) < ∞. In this context, the regression function r(x) = E(Y | X = x) can be approximated by

  r_n(x) = Σ_{i=1}^n W_ni(x) Y_i,   (3)

where the weights W_ni(x) are nonnegative and Σ_{i=1}^n W_ni(x) = 1. To state the main result of this study, the following set of assumptions will be needed:
H1 (F, d) is a metric space;
H2 the probability measure μ of X is a Borel measure;
H3 (X_1, Y_1), ..., (X_n, Y_n) are i.i.d. random elements with the same distribution as the pair (X, Y), which satisfies model (2).
Throughout this study, the symbols P and E without subscript will indicate that these quantities are computed over all the random elements. Otherwise, the subscript will indicate the random element over which we compute them. The symbol C will denote generic constants whose value may be different in each occurrence. Sequences will be abbreviated in the usual way. With a slight abuse of notation, we will write F instead of (F, d) to denote the metric space.

III. STONE'S THEOREM EXTENSION

In this section, we state Stone's consistency theorem [26] for locally compact metric spaces. Moreover, under a slight modification of Stone's conditions, we obtain the result for general metric spaces. All proofs of this section will be given in Appendix A.

Theorem 3.1: Suppose that assumptions H1–H3 hold for a locally compact metric space. Let {W_ni} be a sequence of weights satisfying the following three conditions:
i) for all a > 0, Σ_{i=1}^n W_ni(X) 1{d(X_i, X) > a} → 0 in probability;
ii) max_{1≤i≤n} W_ni(X) → 0 in probability;
iii) there is a constant C such that, for every nonnegative measurable function f with E f(X) < ∞, E[Σ_{i=1}^n W_ni(X) f(X_i)] ≤ C E f(X).
Then, r_n is mean square consistent, i.e., E(r_n(X) − r(X))^2 → 0 as n → ∞.
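For concreteness, the generic weighted estimate (3) can be sketched in the scalar case F = R (my own illustration; the inverse-squared-distance weighting scheme, the data, and the noise level are arbitrary choices, not an estimator studied in the paper):

```python
import numpy as np

# Minimal sketch of the weighted estimate (3): r_n(x) = sum_i W_ni(x) * Y_i
# with nonnegative weights summing to one.  The metric space here is R with
# the usual distance; the weighting scheme and data are illustrative only.
def weighted_estimate(x, X, Y, eps=1e-6):
    w = 1.0 / ((X - x) ** 2 + eps)   # nonnegative raw weights, large near x
    w /= w.sum()                     # normalize: the weights sum to one
    return np.dot(w, Y)

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, 500)
r = lambda t: t ** 2                 # true regression function
Y = r(X) + 0.01 * rng.standard_normal(500)
est = weighted_estimate(0.5, X, Y)   # should be close to r(0.5) = 0.25
```

Because the weights concentrate on sample points near x, the estimate tracks the local behavior of the regression function, which is the mechanism that the conditions of Theorem 3.1 control.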
Let us observe that these hypotheses are exactly the same as the ones given in [26] for finite-dimensional spaces, except that to extend the result to the infinite-dimensional setting we need to ask the space to be locally compact. However, most of the spaces of our interest are not locally compact. In particular, Riesz (1918) showed that a Banach space is locally compact if and only if it is finite-dimensional (see, for instance, [10]). Nevertheless, in the proof of Theorem 3.1, local compactness is only used to approximate r in L^2(μ) by a uniformly continuous and bounded function. Therefore, we can state the result for nonlocally compact spaces if we ask r to be μ-strongly regular.

Definition 3.1: Given a metric space (F, d) and a Borel probability measure μ on F, we say that the function f in L^2(μ) is μ-strongly regular if for every ε > 0 there exists g uniformly continuous and bounded such that E|f(X) − g(X)|^2 < ε. A metric space is μ-strongly regular if every function f in L^2(μ) is μ-strongly regular.

With this notion of regularity, we state the following weaker result:

Theorem 3.2: Suppose that assumptions H1–H3 hold and let {W_ni} be a sequence of weights satisfying conditions (i)–(iii) of Theorem 3.1. If the regression function r is μ-strongly regular, then r_n is mean square consistent.
Let us observe that if μ is regular with respect to the family of compact sets (a Radon probability measure), then the metric space is μ-strongly regular, since in this case every function in L^2(μ) can be approximated by a continuous function with bounded support (and therefore also by a uniformly continuous function); see [23]. Moreover, since μ is a probability measure, we have the following lemma.
Lemma 3.1: Let μ be a Borel probability measure. Then, (F, d) is μ-strongly regular if and only if μ is tight.

As a consequence of this lemma, we have that, if the metric space is complete and separable, then it is μ-strongly regular. This is due to the fact that in a complete and separable metric space, every probability measure is tight (see [5, Th. 1.4, p. 10]).

In Theorem 3.2, for specific values of the weights W_ni, hypothesis (iii) is the most difficult to check. However, in its proof, we only use (iii) applied to the approximating function to show that the corresponding weighted sum is small. Therefore, it is possible to replace (iii) by the following weaker condition, easier to check in practice:
(iii') For all ε > 0, there exists δ > 0 such that, for any uniformly continuous and bounded function g verifying E g(X) < δ, we have lim sup_n E[Σ_{i=1}^n W_ni(X) g(X_i)] < ε.

All these observations together give the following weaker version of Theorem 3.2.

Theorem 3.3: Suppose that assumptions H1–H3 hold. Let {W_ni} be a sequence of weights satisfying conditions (i) and (ii) of Theorem 3.1 and condition (iii') above. If the regression function r is μ-strongly regular (i.e., μ is tight), then r_n is mean square consistent.

Although we have replaced condition (iii) by the weaker one (iii'), this is still difficult to check in particular examples. However, if we ask the regression function to be uniformly continuous and bounded, it is not required anymore, as we state in the following corollary.

Corollary 3.1: Under assumptions H1–H3, if r is uniformly continuous and bounded on F and {W_ni} is a sequence of weights which satisfies conditions (i) and (ii) of Theorem 3.1, then r_n is mean square consistent.

Any square integrable function can be approximated in L^2(μ) by a continuous and bounded function (but not necessarily by a uniformly continuous function, since not every probability measure is tight). Nevertheless, we can extend our result to any function r in L^2(μ) (and any metric space) if we replace condition (i) of Theorem 3.2 by condition (i') below, which is stronger in two ways: instead of a constant value δ, we consider a sequence δ_n converging to zero and, more importantly, the sequence depends on x.

Theorem 3.4: Under assumptions H1–H3, if r is in L^2(μ) and {W_ni} is a sequence of weights satisfying the following conditions:
i') there is a sequence of nonnegative random variables δ_n → 0 a.s. such that Σ_{i=1}^n W_ni(X) 1{d(X_i, X) > δ_n} → 0 in probability;
ii) max_{1≤i≤n} W_ni(X) → 0 in probability;
iii'') for all ε > 0, there exists δ > 0 such that, for any bounded and continuous function g fulfilling E g(X) < δ, we have lim sup_n E[Σ_{i=1}^n W_ni(X) g(X_i)] < ε;
then r_n is mean square consistent.

Corollary 3.2: Under assumptions H1–H3, if r is continuous and bounded on F and {W_ni} is a sequence of weights which satisfies conditions (i') and (ii) of Theorem 3.4, then r_n is mean square consistent.

Corollary 3.3: Suppose that assumptions H1–H3 hold and r is in L^2(μ). Let {W_ni} be a sequence of probability weights satisfying conditions (i'), (ii), and (iii'') of Theorem 3.4. If {W̃_ni} is a sequence of weights asymptotically equivalent to {W_ni} and such that, for each i, W̃_ni(x) ≤ c W_ni(x) for some constant c, then the estimator defined by r̃_n(x) = Σ_{i=1}^n W̃_ni(x) Y_i is mean square consistent.

IV. k-NEAREST NEIGHBOR ESTIMATE CONSISTENCY

In this section, we use Theorem 3.4 to prove the mean square consistency of the k-nearest neighbor (from now on k-NN) estimator given by r_n(x) = Σ_{i=1}^n W_ni(x) Y_i, with weights

  W_ni(x) = 1/k_n if X_i is one of the k_n nearest neighbors of x among X_1, ..., X_n, and W_ni(x) = 0 otherwise,   (4)

where X_(i)(x) denotes the i-th nearest neighbor of x among X_1, ..., X_n. In case of ties, one usually attaches independent uniform random variables Z_1, ..., Z_n to X_1, ..., X_n and breaks ties by comparing the values Z_i. We will assume that there are no ties with probability one (the general case can be handled as in [12]).

Theorem 4.1: Suppose that assumptions H1–H3 hold with (F, d) a separable metric space, and let r be a bounded function satisfying the Besicovitch condition (1).
If {k_n} is a sequence of positive numbers such that k_n → ∞ and k_n/n → 0, then the k-NN estimator is mean square consistent.

Remark 4.1: a) The classification rule considered by Cérou and Guyader [9] is bounded and in L^2(μ); therefore, their result (they also ask for the Besicovitch condition) is a consequence of Theorem 4.1. b) Any continuous and bounded function verifies the Besicovitch condition; therefore, under the hypothesis that r is continuous and bounded, the k-NN estimator is mean square consistent in any separable metric space.
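The k-NN estimate just described can be sketched numerically in the scalar case F = R (a minimal illustration of my own; the regression function, the noise level, and the choice k_n = sqrt(n), which satisfies k_n → ∞ and k_n/n → 0, are arbitrary). As the theorem suggests for a bounded continuous regression function, the mean squared error decreases as n grows:

```python
import numpy as np

# Sketch of the k-NN estimator with weights (4): W_ni(x) = 1/k if X_i is
# among the k nearest neighbors of x, and 0 otherwise.  F = R with the usual
# metric; the data-generating choices below are illustrative assumptions.
def knn_estimate(x, X, Y, k):
    order = np.argsort(np.abs(X - x))   # indices sorted by distance to x
    return Y[order[:k]].mean()          # uniform 1/k weights on the k-NN

def mse(n, rng):
    X = rng.uniform(0, 1, n)
    r = lambda t: np.sin(2 * np.pi * t)       # bounded continuous r
    Y = r(X) + 0.1 * rng.standard_normal(n)
    k = int(np.sqrt(n))                       # k_n -> inf, k_n / n -> 0
    xs = np.linspace(0.1, 0.9, 50)
    preds = np.array([knn_estimate(x, X, Y, k) for x in xs])
    return np.mean((preds - r(xs)) ** 2)

rng = np.random.default_rng(1)
mse_small, mse_large = mse(200, rng), mse(20000, rng)
# The squared error drops as n grows (mean square consistency).
```

This is only a finite-dimensional sanity check; the substance of Theorem 4.1 is that the same conclusion survives in separable metric spaces under the Besicovitch condition.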
V. KERNEL ESTIMATE CONSISTENCY

In this section, we use Theorem 3.4 to prove the mean square consistency of the kernel regression estimate for functional regression given by r_n(x) = Σ_{i=1}^n W_ni(x) Y_i, with weights

  W_ni(x) = K(d(x, X_i)/h_n) / Σ_{j=1}^n K(d(x, X_j)/h_n) if Σ_{j=1}^n K(d(x, X_j)/h_n) ≠ 0, and W_ni(x) = 1/n otherwise,   (5)

K being a regular kernel, i.e., a kernel function for which there are constants 0 < c_1 ≤ c_2 < ∞ such that c_1 1_{[0,1]}(u) ≤ K(u) ≤ c_2 1_{[0,1]}(u).
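A minimal scalar sketch of the kernel estimate with the uniform kernel K(u) = 1{u ≤ 1}, which is regular with c_1 = c_2 = 1 (the data, the bandwidth, and the fallback for an empty window are illustrative choices of my own, not the paper's):

```python
import numpy as np

# Sketch of the kernel estimate (5) with the uniform (moving window) kernel:
# the weight of X_i is proportional to K(d(x, X_i) / h) = 1{|x - X_i| <= h}.
# F = R; data, bandwidth, and the empty-window fallback are assumptions.
def kernel_estimate(x, X, Y, h):
    w = (np.abs(X - x) <= h).astype(float)   # uniform kernel weights
    if w.sum() == 0:
        return Y.mean()                      # fallback for an empty window
    return np.dot(w, Y) / w.sum()

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, 2000)
Y = np.exp(-X) + 0.05 * rng.standard_normal(2000)
est = kernel_estimate(0.5, X, Y, h=0.05)     # true value exp(-0.5)
```

With the uniform kernel, the estimate is exactly the average of the responses whose covariates fall in the closed ball B(x, h), which is why the Besicovitch ball-average condition governs its consistency.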
Theorem 5.1: Suppose that assumptions H1–H3 hold with (F, d) a separable metric space, and let r be a bounded function satisfying the Besicovitch condition (1). If K is a regular kernel, then there is a sequence h_n → 0 a.s. for which the kernel estimator is mean square consistent.

Remark 5.1: a) It is possible to prove the consistency of the kernel estimator for every sequence h_n → 0 a.s. such that n μ(B(x, h_n)) → ∞, where B(x, h_n) is the closed ball of center x and radius h_n. b) The classification rule considered by Abraham et al. [1] is bounded and in L^2(μ), and therefore their result (they also ask for the Besicovitch condition) is a consequence of Theorem 5.1 (moreover, they require a stronger condition than separability). c) Any continuous and bounded function in L^2(μ) verifies the Besicovitch condition; therefore, under the continuity and boundedness of r, the kernel estimator is mean square consistent in any separable metric space for at least one sequence h_n.

Another important result that we add is the necessity of the Besicovitch condition in order to achieve a slightly stronger result than mean square consistency of uniform kernel estimates, as stated in the following proposition.
Proposition 5.1: Under assumptions H1–H3 with (F, d) a separable metric space, if K is the uniform kernel, then the Besicovitch condition (1) is necessary in order to attain consistency for almost all x, i.e., r_n(x) → r(x) in probability for almost all x.

VI. EXAMPLES AND COUNTEREXAMPLES

μ-strong regularity: Let (F, d) be a metric space and μ a Borel probability measure on F. As mentioned before, the space is μ-strongly regular if and only if the probability measure is tight. If μ is not tight, it follows easily that F is not μ-strongly regular. An example of a measure which is not tight is given in [6, Example 7.1.6] (see also [5, Exercise 1.13]): let E be a nonmeasurable subset of the interval [0, 1] with zero inner measure and unit outer measure. We consider E with the usual metric as a metric space. Then, every Borel subset of this space has the form B ∩ E, where B is a Borel subset of [0, 1]. We define a measure on E by μ(B ∩ E) = λ(B), where λ is Lebesgue measure (i.e., μ is the restriction of λ to E). Since the Lebesgue measure is regular, the measure μ is regular as well (we recall that the closed sets in E are the intersections of E with closed subsets of [0, 1]). But it is not tight, since every compact set K in the space E is also compact in [0, 1] and hence, by construction, has Lebesgue measure zero; hence, we obtain μ(K) = 0.

Examples where the Besicovitch condition does not hold:
1) Preiss [21] provides an example of a Gaussian measure μ defined on a Hilbert space and a bounded measurable function f for which the Besicovitch condition fails.
2) Preiss [20] exhibits a Gaussian measure μ defined on a Hilbert space and a bounded measurable function f such that, for almost every x, the limit of the averages of f over the balls B(x, δ), as δ → 0, does not exist.
3) Preiss [20] (see also [9]) provides an example of a Gaussian measure μ defined on a Hilbert space for which there exists a measurable set A whose ball averages fail to converge to its indicator function on a set of positive measure. Therefore, if we take f = 1_A, the Besicovitch condition does not hold, while f is μ-strongly regular.

Examples where the Besicovitch condition holds:
1) In every locally compact metric space F, the Besicovitch condition holds for every integrable function f and any Borel probability measure μ. However, if F is a locally compact Banach space, Riesz showed that F is a finite-dimensional space (see, for instance, [10]).
2) An example of a nondegenerate Gaussian measure μ on l^2 for which the Besicovitch condition holds for almost every x and every f in L^2(μ) is given in Cérou and Guyader [9], due to Preiss and Tiser [22]. More precisely, for a given nonnegative sequence (c_i) with Σ_i c_i < ∞, consider the space l^2 and the measure μ given by the countable product of the normal distributions N(0, c_i) on R. They show that if there exists q < 1 such that c_{i+1}/c_i ≤ q for all i, then the Besicovitch condition holds for all f in L^2(μ).
, we write the decomposition (7), where, in the first term, we have used (6), and in the third term we have used condition (iii'). Since g is uniformly continuous, for each ε > 0 there exists δ > 0 such that if d(u, v) ≤ δ, then |g(u) − g(v)| ≤ ε. For that δ, using the boundedness of g, we get
APPENDIX A
PROOF OF THE MAIN RESULTS

We will use the short notation d_i(x) for the distance d(x, X_i) from x to the i-th sample element.
Proof of Theorem 3.2: This proof follows the general lines of the proof of Stone's theorem for the finite-dimensional case (see [13]). Let us consider the auxiliary estimator built by applying the weights to r(X_i). Since

(8) holds due to (i), replacing in (7) we obtain the convergence of the first part. In order to deal with the remaining term, let us observe that, by conditional independence,

it will be sufficient to prove that the corresponding sum converges to zero. Since

and

which implies

and, by Jensen's inequality,

Let ε > 0 and δ be as given in condition (iii'). Since r is μ-strongly regular, for all ε there exists a uniformly continuous and bounded function g such that (6) holds

for n large, where, in the last inequality, we use condition (ii), and the result follows taking ε arbitrarily small.
Proof of Corollary 3.1: We replace g by r in the boundedness of (8) in the proof of Theorem 3.2.
Proof of Theorem 3.4: In order to get the result, we only need to redo the bound (8) in the proof of Theorem 3.2 with g continuous and bounded. For the sequence δ_n given in (i'), we write
In addition, to prove Theorem 4.1 we will also need the following technical results whose proofs are given in Appendix B. Lemma A.2: For each , there is a sequence
, such that
and
. Lemma A.3: Let be a metric space, be fixed and the weights given by (4). Then, for all nonnegative bounded measurable function
Therefore, it will be sufficient to prove that and to zero. By the boundedness of , we have that
converge
which converges to zero by (i’). Now, let us consider . Since , is equal to
Lemma A.4: Let (F, d) be a separable metric space, μ a Borel probability measure, and X_1, ..., X_n a random sample of μ. If x belongs to the support of μ and k_n/n → 0, then the distance from x to its k_n-nearest neighbor tends to zero a.s. Moreover, the same holds at X a.s. whenever k_n/n → 0.

Proof of Theorem 4.1: We will prove that the weights given by (4) satisfy conditions (i'), (ii), and (iii'') of Theorem 3.4. To prove (i'), let us consider a sequence δ_n fulfilling the conclusion of Lemma A.2. Then
(9) Let be fixed. Since is continuous, for all , there exists such that if then . For that , there exists such that if then, . Therefore, for a given , there exists such that if , . In addition, by the boundedness of we have that then, the dominated convergence theorem implies that
Proof of Corollary 3.2: We replace g by r in the proof of Theorem 3.4.
In what follows, the distance from x to its k-nearest neighbor and the closed ball B(x, δ) of center x and radius δ will be denoted as above. To prove Theorems 4.1 and 5.1, we will need the following lemma, whose proof will be given in Appendix B.

Lemma A.1: Let

Let x be fixed. Then
Therefore, it will be sufficient to prove that

For that, let us observe that

if and only if

then, it will be equivalent to show that
be a separable metric space. If , then
(10)
. Corollary A.1: Let (F, d) be a separable metric space. Then, for almost every x, μ(B(x, δ)) > 0 for all δ > 0.
Since
, for large enough we have that from which it follows that
.
Therefore, from Hoeffding's inequality and the preceding fact, we get
Finally, since the right-hand side of this inequality does not depend on x, we can apply the dominated convergence theorem and conclude that
Therefore, the proof will be completed if we show that the expectation with respect to X of these three functions converges to zero. For this, let ε > 0. Since g is continuous, there exists δ > 0 such that if d(u, x) < δ, then |g(u) − g(x)| < ε. On the other hand, since by Lemma A.4 the nearest neighbor distances tend to zero, for that δ the neighbors eventually lie in B(x, δ). Then, the first function converges to 0 for almost all x and is also bounded; therefore, by the dominated convergence theorem, its expectation tends to zero. In addition, the boundedness of r implies the boundedness of the remaining terms; then, using the dominated convergence theorem again, we have that
On the other hand, condition (ii) is trivial since max_i W_ni(X) = 1/k_n → 0. It remains to prove that condition (iii'') holds. Since μ is a Borel measure, for all δ > 0 there exists g continuous and bounded such that the required approximation holds. From Lemma A.1, we
Finally, since
is bounded
which converge to zero if the bounded random variables converge to zero in probability. To see this, fix
Let get
. For every
be fixed, using the conditional trick again we
(11) The Besicovitch condition (1) implies that the second term converges to zero, while the convergence to zero of the first term follows from Lemma A.4.
Applying Lemma A.3 with we get
in (11),
We will use the short notation B(x, h) to denote the ball of center x and radius h. To prove Theorem 5.1, we will need the following technical lemmas, which are proved in Appendix B.

Lemma A.5: For each x, there exists a sequence of positive real numbers h_n → 0 such that

Lemma A.6: For each x, if

then

Lemma A.7: Let (F, d) be a metric space, x be fixed, and the weights be given by (13). Then, for all nonnegative measurable functions, we have (12)
Proof of Theorem 5.1: We take fulfilling the conclusion Lemma A.5 and denote by supp . For and for the sequence , let be the weights given by (5), and the weights defined by if if
(13)
Since is regular, there are constants that, for each
which entails that condition (i’) is fulfilled. In order to prove that weights satisfy condition (ii), we write
For
fixed, we define
such
if if Since
From this inequality, it follows that and in this case
, we have that
if and only if
Therefore, it will be sufficient to prove that
which entails
To do this, we write (15)
Therefore, by Corollary 3.3, it suffices to prove that the weights satisfy conditions (i’), (ii), and (iii”) of Theorem 3.4. To . Then prove (i’), let us take
Let us observe that the last expression is well defined since implies . By Lemma A.6, for all such that if there exists
,
(14)
On the other hand, by Lemma A.5, there exists such that if
Given , let , there exists
In addition,
be fixed. Since such that if for all and consequently
,
from what
Then in (15), for all if
for some positive constant
, there exists
such that
from what follows that
follows that In addition, since , we can apply the dominated convergence theorem and conclude that Therefore, applying the dominated convergence theorem in (14), we have that
which entails (ii). It remains to verify that condition (iii”) holds. Since and is a Borel measure, for all ,
there exists
continuous and bounded such that . By Lemma A.1
Therefore, the assumption implies that the Besicovitch condition holds a.s. and then in probability. (16)

APPENDIX B
PROOF OF SOME TECHNICAL LEMMAS
Let x be fixed; then, applying Lemma A.7 in (12) and breaking the expression into three parts, we can use the same arguments as in the proof of part (iii'') of Theorem 4.1. Finally, since the Besicovitch condition holds, taking expectation we get the desired result.

Proof of Proposition 5.1: Let h_n be as given in Lemma A.5 and let K be the uniform kernel. Let us consider the uniform kernel type estimator, for which we will suppose that r_n(x) → r(x) in probability for almost all x. We write
Proof of Lemma 3.1: (⇒) Since μ is a Borel probability measure, by [5, Th. 1.1, p. 7], there exists a closed set of measure arbitrarily close to one. On the other hand, by strong regularity there exists a continuous function of compact support approximating the corresponding indicator, from which it follows that the mass outside a compact set is small. Finally, we conclude that μ is tight. (⇐) Since μ is a Borel probability measure, given a Borel set B, by [5, Th. 1.1, p. 7], there exists a closed set contained in B whose measure is arbitrarily close to that of B. On the other hand, if μ is also tight, there exists a compact set of measure arbitrarily close to one. Therefore, the intersection of these two sets is a compact set fulfilling the required approximation, which implies that μ is regular with respect to the family of compact sets (a Radon measure) and therefore the space is μ-strongly regular.

Proof of Lemma A.1: Let x not belong to the support of μ. By definition of support, we have that
. By definition of
Since F is a separable metric space, it contains a countable dense subset D. That is, for each x, there is a point of D arbitrarily close to x. Let D' be the subset of such points, and for each of them we define
and
By Lemma A.6
Therefore from what follows that, for all such that if
Then, for all
, we get
, there exists The right side of this inclusion is a union of countable many sets so, it remains to prove they have zero measure. For that, let us observe that for all , there is such that, since , . This implies that which has zero measure by the definition of from where the result follows. Proof of Lemma A.2: Let , we assume that If
be fixed and . Without loss of generality, for all .
is an atom we have for and for all . Therefore, the sequence satisfies the property. If is not an atom, . For fixed, we define . is well defined. In fact, first we have that for all the set is not empty since contain .
6706
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 58, NO. 11, NOVEMBER 2012
Second, suppose that there exists n for which the defining inequality fails for all candidates. Then, by maximality, we get a contradiction. We will now prove the convergence to zero when n → ∞. Suppose it is not true; then there exists a positive lower bound along infinitely many values of n. Since each term is a maximum, the corresponding inequality holds infinitely often, from which we get a contradiction since x is not an atom. Therefore, the sequence satisfies the requirements.

Proof of Lemma A.3: Conditional on the distance to the k-nearest neighbor, we can consider the neighbors as an ordered sample from the distribution of X restricted to the corresponding ball. Therefore, if we consider i.i.d. elements from this truncated distribution, we can write
In particular, for
, it follows that
On the other hand, from (20), we have that
from which (19) follows. Let us now consider the case where x is not fixed. This case follows from the previous one as well: since the metric space is separable, by Lemma A.1, we have
From the first case, the convergence holds and, in addition,

Proof of Lemma A.4: Let x be fixed and consider the measure of a ball around x, which is positive by definition of support. We will show that the nearest neighbor distance tends to zero a.s., which is equivalent to showing that, for all ε > 0,
. Therefore, by the dominated convergence theorem, we have that
(17) when if
. Let us observe that
if and only
(18)
Proof of Lemma A.5: Let us take a sequence for which the conclusion of Lemma A.2 applies; for it, we can find h_n such that the required inequality holds, from which the lemma follows.
Then, (17) is equivalent to prove that
Proof of Lemma A.6: Let (19) For
when . By hypothesis, the right-hand side of (18) goes to 0 and therefore, since , there exists such that if
and be a sequence such that arbitrary, we write
be fixed such that .
(21)
(20) On the other hand, by the strong law of large numbers, the left-hand side of (18) converges with probability one to
Therefore, for all
, when
, we have
Defining the i.i.d. Bernoulli random variables with parameter , using Bernstein’s Inequality in (21), we have that
(22)
Therefore, from (21) and (22), we get
then
(23)
Finally, with this inequality in (24), we get
which converges to zero by Lemma A.5.

Proof of Lemma A.7: For x fixed, let us consider the weights given in (13) and let us define
and
. Since
are i.i.d. and
depends only on
ACKNOWLEDGMENT
, we can write
We would like to thank the referees for their valuable comments and insightful suggestions. In particular, we really appreciate the careful attention of one of them, who caught some mistakes in the first version of this manuscript. We would also like to thank Ricardo Toledano and Roberto Scotto for helpful discussions.

REFERENCES
(24)
where, in the last equality, we have used that the factor depends only on variables which are independent of the others. Now, by Chebyshev's inequality, we have
(25)
Since follows that
and
are i.i.d., and
, it
[1] C. Abraham, G. Biau, and B. Cadre, “On the kernel rule for function classification,” Ann. Inst. Statist. Math., vol. 58, no. 3, pp. 619–633, 2006. [2] A. Baíllo and A. Grané, “Local linear regression for functional predictor and scalar response,” J. Multivariate Anal., vol. 100, no. 1, pp. 102–111, 2009. [3] G. Biau, F. Bunea, and M. H. Wegkamp, “Functional classification in Hilbert spaces,” IEEE Trans. Inf. Theory, vol. 51, no. 6, pp. 2163–2172, Jun. 2005. [4] G. Biau, F. Cérou, and A. Guyader, “Rates of convergence of the functional k-nearest neighbor estimate,” IEEE Trans. Inf. Theory, vol. 56, no. 4, pp. 2034–2040, Apr. 2010. [5] P. Billingsley, Convergence of Probability Measures, ser. Wiley Series in Probability and Statistics: Probability and Statistics, 2nd ed. New York: Wiley, 1999. [6] V. Bogachev, Measure Theory, Vol. II. New York: Springer, 2007. [7] H. Cardot, F. Ferraty, and P. Sarda, “Spline estimators for the functional linear model,” Statist. Sinica, vol. 13, no. 3, pp. 571–591, 2003. [8] H. Cardot, A. Goia, and P. Sarda, “Testing for no effect in functional linear regression models, some computational approaches,” Commun. Statist. Simulation Comput., vol. 33, no. 1, pp. 179–199, 2004. [9] F. Cérou and A. Guyader, “Nearest neighbor classification in infinite dimension,” ESAIM Probab. Statist., vol. 10, pp. 340–355, 2006. [10] M. Cotlar and R. Cignoli, An Introduction to Functional Analysis. Amsterdam, The Netherlands: North Holland, 1974. [11] A. Cuevas, M. Febrero, and R. Fraiman, “Linear functional regression: The case of fixed design and functional response,” Can. J. Statist., vol. 30, no. 2, pp. 285–300, 2002. [12] L. Devroye, “On the almost everywhere convergence of nonparametric regression function estimates,” Ann. Statist., vol. 9, no. 6, pp. 1310–1319, 1981. [13] L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York: Springer-Verlag, 1996. [14] F. Ferraty and P. Vieu, Nonparametric Functional Data Analysis. Theory and Practice. New York: Springer, 2006. [15] P. Hall and J. L. Horowitz, “Methodology and convergence rates for functional linear regression,” Ann. Statist., vol. 35, no. 1, pp. 70–91, 2007.
[16] P. Mattila, “Differentiation of measures on uniform spaces,” in Measure Theory Oberwolfach 1979, ser. Lecture Notes in Mathematics. Berlin, Germany: Springer, 1980, vol. 794, pp. 261–283. [17] C. Preda and G. Saporta, “Clusterwise PLS regression on a stochastic process,” Comput. Statist. Data Anal., vol. 49, no. 1, pp. 99–108, 2005. [18] C. Preda and G. Saporta, “PLS regression on a stochastic process,” Comput. Statist. Data Anal., vol. 48, pp. 149–158, 2005. [19] C. Preda, G. Saporta, and C. Lévéder, “PLS classification of functional data,” Comput. Statist., vol. 22, no. 2, pp. 223–235, 2007. [20] D. Preiss, “Gaussian measures and the density theorem,” Comment. Math. Univ. Carolin., vol. 22, no. 1, pp. 181–193, 1981. [21] D. Preiss, “Differentiation of measures in infinitely-dimensional spaces,” in Proc. Conf. Topology Meas. III, Part 1, 2 (Vitte/Hiddensee, 1980), 1982, pp. 201–207. [22] D. Preiss and J. Tiser, “Differentiation of measures on Hilbert spaces,” in Measure Theory Oberwolfach 1981, ser. Lecture Notes in Mathematics. Berlin, Germany: Springer, 1982, vol. 945, pp. 194–207. [23] W. Rudin, Real and Complex Analysis, 3rd ed. New York: McGraw-Hill, 1987. [24] J. O. Ramsay and B. W. Silverman, Functional Data Analysis, ser. Springer Series in Statistics. New York: Springer, 1997. [25] B. P. Rynne and M. A. Youngson, Linear Functional Analysis. New York: Springer, 2008.
[26] C. J. Stone, “Consistent nonparametric regression,” Ann. Statist., vol. 5, no. 4, pp. 595–645, 1977. [27] R. Toledano, “A note on the Lebesgue differentiation theorem in spaces of homogeneous type,” Real Anal., vol. 29, no. 1, pp. 335–340, 2003. [28] R. Wheeden and A. Zygmund, Measure and Integral. An Introduction to Real Analysis. Boca Raton, FL: CRC, 1977.
Liliana Forzani, biography not available at the time of publication.
Ricardo Fraiman, biography not available at the time of publication.
Pamela Llop was born in Santa Fe, Argentina, on December 11, 1982. She received the Ph.D. degree from the Facultad de Ingeniería Química, Universidad Nacional del Litoral, Santa Fe, in 2011. She is currently an Auxiliary Professor at the Facultad de Ingeniería Química, Universidad Nacional del Litoral, Santa Fe, Argentina. She is interested in functional data and nonparametric statistics.