IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 58, NO. 11, NOVEMBER 2012


Consistent Nonparametric Regression for Functional Data Under the Stone–Besicovitch Conditions

Liliana Forzani, Ricardo Fraiman, and Pamela Llop

Abstract—In this paper, we address the problem of nonparametric regression estimation in the infinite-dimensional setting. We start by extending Stone's seminal result to the case of metric spaces when the probability measure of the explanatory variables is tight. Then, under slight variations of the hypotheses, we state and prove the theorem for general metric measure spaces. From this result, we derive the mean square consistency of the k-NN and kernel estimators if the regression function is bounded and the Besicovitch condition holds. We also prove that, for the uniform kernel estimate, the Besicovitch condition is also necessary in order to attain consistency at almost every point.

Index Terms—Functional data, nonparametric regression, separable metric spaces.

I. INTRODUCTION

In the functional data context, the problem of infinite-dimensional linear regression has been popularized by Ramsay and Silverman [24]. Since then, many authors have worked on this topic. Among them, we refer to [2], [7], [8], [11], [15], [17], [18], and [19]. In the nonparametric framework, there exist several results about the consistency of regression estimators. Most of them are for specific estimates (like k-NN and kernel estimators) and in some cases are given only for classification problems. For the kernel estimator, the authors of [14] provide consistency results assuming continuity of the regression function, boundedness of the moments of the response, and a condition on the distribution of the process (the so-called small ball probability assumption). In addition, if the regression function is Lipschitz, they also obtain rates of convergence of the estimator. When the response is binary (classification), Biau et al. [3] proved the weak consistency of the k-NN classification rule in separable Hilbert spaces. They used a filtering method to reduce the dimension of the space and then applied Stone's consistency theorem in finite dimension (see [26]). On the other hand, for separable metric spaces, Cérou and Guyader [9] proved the

Manuscript received November 12, 2011; revised May 10, 2012 and July 03, 2012; accepted July 12, 2012. Date of publication August 03, 2012; date of current version October 16, 2012. This work was supported by PICT2008-0921, PICT2008-0622, PI 62-309, and PIP 112-200801-0218. L. Forzani and P. Llop are with the Facultad de Ingeniería Química and Instituto de Matemática Aplicada del Litoral, Universidad Nacional del Litoral—Consejo Nacional de Investigaciones Científicas y Técnicas, S3000GLN Santa Fe, Argentina (e-mail: [email protected]; [email protected]). R. Fraiman is with the Departamento de Matemática y Ciencias, Universidad de San Andrés, 1644 Victoria, Argentina, and also with the Centro de Matemática, Universidad de la República, 11300 Montevideo, Uruguay (e-mail: [email protected]). Communicated by N. Cesa-Bianchi, Associate Editor for Pattern Recognition, Statistical Learning, and Inference. Digital Object Identifier 10.1109/TIT.2012.2209628

weak consistency of that classification rule under some regularity condition on the regression function with respect to the probability measure of the explanatory random element. This condition is called the Besicovitch condition (1), and it is required to hold in probability, the balls involved being closed balls centered at the given point with the given radius. Abraham et al. [1] showed that the moving window classification rule is not consistent in general metric spaces and gave conditions on the space and the regression function to ensure the strong consistency of the estimator. These conditions are the existence of an increasing sequence of totally bounded subsets, a condition on the bandwidth related to the covering numbers, and the Besicovitch condition. For general regression functions, Biau et al. [4] have recently presented counterexamples showing that the k-NN estimator is not consistent in general metric spaces. However, they found rates of mean square convergence for Lipschitz regression functions in separable Banach spaces under somewhat restrictive conditions. More precisely, they require that the support of the underlying distribution be totally bounded and that the smallest radius for which open balls of this radius cover the support be finite. As mentioned by the authors, in infinite-dimensional spaces closed balls are not totally bounded, so this requirement fails most of the time. However, they provide a nice set of examples where the condition holds for some families of distributions, such as when the support is included and bounded in the space of several times differentiable functions with bounded derivatives. As we have seen in the previous references, the Besicovitch condition is at the heart of most consistency proofs for classical nonparametric estimates (like k-NN and kernel based estimates). It clearly combines conditions on the regression function and the underlying distribution of the explanatory variables.
If the regression function is continuous, then nothing further is required, while if it is not continuous, some requirement on the underlying measure (like a homogeneity condition) is necessary in order for the Besicovitch condition to hold (see [16], [22], and [27]). In finite dimension, the condition holds automatically for any integrable function, since it is just the Differentiation Theorem with respect to a finite measure (see, for instance, [28, p. 189]). Although the result is no longer true in infinite-dimensional spaces, it holds in a general setting if the function is continuous. Several authors have attempted to generalize this theorem to general metric spaces, obtaining important results (see, for instance, [16], [22], and [27] and the references therein).
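In the notation of Cérou and Guyader [9], writing the regression function as $\eta$, the law of the explanatory element $X$ as $\mu$, and the closed ball of center $x$ and radius $\delta$ as $B(x,\delta)$ (symbols we adopt here for concreteness), the Besicovitch condition (1) can be written as:

```latex
% Besicovitch condition: ball averages of the regression function
% recover its value at the center, in probability
\lim_{\delta \to 0} \frac{1}{\mu\left(B(X,\delta)\right)}
\int_{B(X,\delta)} \eta \, \mathrm{d}\mu \;=\; \eta(X)
\qquad \text{in probability.}
```

In finite dimension, this is exactly the Lebesgue differentiation theorem; in infinite dimension, it is a genuine joint restriction coupling the regression function with the law of the covariate.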

0018-9448/$31.00 © 2012 IEEE


In this paper, we focus on providing consistency results under weak assumptions which are easy to understand. We start by extending Stone's seminal result (see [26]) to the case of locally compact metric spaces. Then we extend the result to the case of metric spaces equipped with a tight measure. Next, we state the theorem for the case when the regression function is somewhat regular, and later we present a result that, under slight modifications of the hypotheses, holds for metric measure spaces in general. As a consequence, we derive the consistency of the k-NN and kernel-based estimators when the Besicovitch condition holds and the regression function is bounded. We also prove that, for the uniform kernel estimate, the Besicovitch condition is necessary in order to attain consistency at almost every point.

II. SETTING AND NOTATION

Let us consider a metric space and i.i.d. random elements with the same law as a generic pair fulfilling the model

(2)

where the error satisfies the usual centering and moment conditions. In this context, the regression function can be approximated by

(3)

where the weights are nonnegative. To state the main result of this study, the following set of assumptions will be needed:
H1: the underlying space is a metric space;
H2: the probability measure of the explanatory variable is a Borel measure;
H3: the data are i.i.d. random elements with the same distribution as the pair, which satisfies model (2).
Throughout this study, probabilities and expectations without a subscript are computed over all the random elements; otherwise, the subscript indicates the random element over which we compute them. Generic constants, whose value may be different in each occurrence, are denoted by a single symbol, and, with a slight abuse of notation, we refer to the metric space by its underlying set.

III. STONE'S THEOREM EXTENSION

In this section, we state Stone's consistency theorem [26] for locally compact metric spaces. Moreover, under a slight modification of Stone's conditions, we obtain the result for general metric spaces. All proofs of this section are given in Appendix A.

Theorem 3.1: Suppose that assumptions H1–H3 hold for a locally compact metric space, and let the weights satisfy the following three conditions:
i) for all ;
ii) ;
iii) there is a constant such that, for every nonnegative measurable function with ,
Then, the estimator (3) is mean square consistent.
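In the standard notation of Stone [26] (we write $r$ for the regression function, $(X_i, Y_i)$ for the sample, and $W_{ni}$ for the weights; these symbols are adopted here for concreteness, since the displayed formulas did not survive), the model (2) and the estimate (3) take the form:

```latex
% model (2): regression with centered errors
Y_i = r(X_i) + \varepsilon_i, \qquad \mathbb{E}\left[\varepsilon_i \mid X_i\right] = 0,
% estimate (3): weighted combination of the observed responses
r_n(x) = \sum_{i=1}^{n} W_{ni}(x)\, Y_i, \qquad W_{ni}(x) \ge 0.
```

When the weights additionally sum to one, one speaks of probability weights, the case singled out in Corollary 3.3 below.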

Let us observe that these hypotheses are exactly the same as the ones given in [26] for finite-dimensional spaces, except that, to extend the result to the infinite-dimensional setting, we need to ask the space to be locally compact. However, most of the spaces of our interest are not locally compact. In particular, Riesz (1918) showed that a Banach space is locally compact if and only if it is finite-dimensional (see, for instance, [10]). Nevertheless, in the proof of Theorem 3.1, local compactness is only used to approximate the regression function by a uniformly continuous and bounded function. Therefore, we can state the result for nonlocally compact spaces if we ask the regression function to be -strongly regular.

Definition 3.1: Given a metric space and a Borel probability measure on it, we say that a function is -strongly regular if for every there exists a uniformly continuous and bounded function such that . A metric space is -strongly regular if every such function is -strongly regular.

With this notion of regularity, we state the following weaker result.

Theorem 3.2: Let us suppose that assumptions H1-H3 hold and let the weights satisfy conditions (i)–(iii) of Theorem 3.1. If the regression function is -strongly regular, then the estimator (3) is mean square consistent.

Let us observe that if is regular with respect to the family of compact sets (a Radon probability measure), then the metric space is -strongly regular since, in this case, every function can be approximated by a continuous function with bounded support (and therefore also by a uniformly continuous function); see [23]. Moreover, since is a probability measure, we have the following lemma.


Lemma 3.1: Let be a probability measure. Then, is -strongly regular if and only if is tight. As a consequence of this lemma, if the metric space is complete and separable, then it is -strongly regular. This is due to the fact that, in a complete and separable metric space, every probability measure is tight (see [5, Th. 1.4, p. 10]). In Theorem 3.2, for specific choices of the weights, hypothesis (iii) is the most difficult to check. However, in its proof, we use (iii) with to show that


Therefore, it is possible to replace (iii) by the following weaker condition, which is easier to check in practice:
(iii') for all , there exists such that for any uniformly continuous and bounded function verifying we have

All these observations together give the following weaker version of Theorem 3.2.

Theorem 3.3: Let us suppose that assumptions H1-H3 hold. Let be a sequence of weights satisfying conditions (i) and (ii) of Theorem 3.1 and condition (iii') above. If the regression function is -strongly regular, i.e., is tight, then the estimator is mean square consistent.

Although we have replaced condition (iii) by the weaker condition (iii'), the latter is still difficult to check in particular examples. However, if we ask the regression function to be uniformly continuous and bounded, it is not required anymore, as we state in the following corollary.

Corollary 3.1: Under assumptions H1–H3, if the regression function is uniformly continuous and bounded and is a sequence of weights satisfying conditions (i) and (ii) of Theorem 3.1, then the estimator is mean square consistent.

Any square integrable function can be approximated in by a continuous and bounded function (but not necessarily by a uniformly continuous one, since not every probability measure is tight). Nevertheless, we can extend our result to any such function (and any metric space) if we replace condition (i) of Theorem 3.2 by condition (i') below, which is stronger in two ways: instead of a constant value, we consider a sequence converging to zero and, more importantly, the sequence depends on the data.

Theorem 3.4: Under assumptions H1-H3, if and is a sequence of weights satisfying the following conditions:
(i') there is a sequence of nonnegative random variables converging to zero a.s. such that ;
(ii) ;
(iii") for all , there exists such that for any bounded and continuous function fulfilling we have that ;
then the estimator is mean square consistent.

Corollary 3.2: Under assumptions H1–H3, if the regression function is continuous and bounded and is a sequence of weights satisfying conditions (i') and (ii) of Theorem 3.4, then the estimator is mean square consistent.

Corollary 3.3: Let us suppose that assumptions H1-H3 hold and . Let be a sequence of probability weights satisfying conditions (i'), (ii), and (iii") of Theorem 3.4. If is a sequence of weights such that and, for each , for some constant , then the estimator defined by these weights is mean square consistent. Here, means that, for each , and .

IV. k-NEAREST NEIGHBOR ESTIMATE CONSISTENCY

In this section, we use Theorem 3.4 to prove the mean square consistency of the k-nearest neighbor (from now on k-NN) estimator, given by

with weights

if otherwise (4)

where is the k-nearest neighbor of among . In case of ties, one usually attaches independent uniform random variables and breaks the ties by comparing their values. We will assume that there are no ties with probability one (the general case can be handled as in [12]).

Theorem 4.1: Let us suppose that assumptions H1–H3 hold with a separable metric space, and let the regression function be bounded and satisfy the Besicovitch condition (1).

If is a sequence of positive real numbers such that , then the k-NN estimator is mean square consistent.

Remark 4.1: a) The classification rule considered by Cérou and Guyader [9] is bounded and in ; therefore, their result (they also require the Besicovitch condition) is a consequence of Theorem 4.1. b) Any continuous and bounded function verifies the Besicovitch condition; therefore, under the hypothesis that the regression function is continuous and bounded, the k-NN estimator is mean square consistent in any separable metric space.
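To fix ideas, the k-NN estimate of this section can be sketched numerically. Here curves are discretized on a common grid and compared in the discretized L2 metric; the function names and the toy data below are our own illustration of the uniform weight scheme (4), not the authors' code:

```python
import numpy as np

def knn_regression(X_train, y_train, x0, k):
    """k-NN regression estimate at a query curve x0.

    Each row of X_train is a curve sampled on a common grid; the
    distance between curves is the discretized L2 (Euclidean) norm.
    The estimate puts weight 1/k on each of the k nearest curves and
    weight 0 on the rest, mirroring the weights (4).
    """
    d = np.linalg.norm(X_train - x0, axis=1)  # distances to the query curve
    nn = np.argsort(d)[:k]                    # indices of the k nearest curves
    return y_train[nn].mean()                 # uniform 1/k weights

# Toy functional data: X_i(t) = a_i * sin(t), regression value r(X_i) = a_i.
rng = np.random.default_rng(0)
t = np.linspace(0.0, np.pi, 50)
a = rng.uniform(0.0, 1.0, size=200)
X = a[:, None] * np.sin(t)[None, :]
y = a + rng.normal(scale=0.05, size=200)      # noisy responses
x0 = 0.5 * np.sin(t)                          # query curve with r(x0) = 0.5
estimate = knn_regression(X, y, x0, k=10)
```

For consistency, the number of neighbors should grow with the sample size while remaining a vanishing fraction of it, in line with the hypotheses of Theorem 4.1.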

V. KERNEL ESTIMATE CONSISTENCY

In this section, we use Theorem 3.4 to prove the mean square consistency of the kernel estimate for functional regression, given by

with weights

(5)

where is a regular kernel, i.e., a kernel function for which there are constants such that .

Theorem 5.1: Let us suppose that assumptions H1–H3 hold with a separable metric space, and let the regression function be bounded and satisfy the Besicovitch condition (1). If is a regular kernel, then there is a sequence of bandwidths a.s. for which the kernel estimator is mean square consistent.

Remark 5.1: a) It is possible to prove the consistency of the kernel estimator for every sequence a.s. such that , where is the closed ball of center and radius . b) The classification rule considered by Abraham et al. [1] is bounded and in ; therefore, their result (they also require the Besicovitch condition) is a consequence of Theorem 5.1 (moreover, they require a condition stronger than separability). c) Any continuous and bounded function in verifies the Besicovitch condition; therefore, under the continuity and boundedness of the regression function, the kernel estimator is mean square consistent in any separable metric space for at least one sequence of bandwidths.

Another important result is the necessity of the Besicovitch condition in order to achieve a slightly stronger property than mean square consistency of uniform kernel estimates, as stated in the following proposition.
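In the same illustrative spirit, the moving window (uniform kernel) estimate can be sketched as below; the names and toy data are hypothetical, and the ball radius plays the role of the bandwidth:

```python
import numpy as np

def uniform_kernel_regression(X_train, y_train, x0, h):
    """Moving window estimate: average the responses of the training
    curves lying in the closed ball of center x0 and radius h.

    This corresponds to the kernel weights with the uniform kernel
    K(u) = 1{0 <= u <= 1}; when the ball contains no data, the
    estimate is conventionally set to 0 here.
    """
    d = np.linalg.norm(X_train - x0, axis=1)  # distances to the query curve
    in_ball = d <= h                          # uniform kernel weights
    if not in_ball.any():
        return 0.0
    return y_train[in_ball].mean()

# Same kind of toy functional data as in the k-NN sketch.
rng = np.random.default_rng(1)
t = np.linspace(0.0, np.pi, 50)
a = rng.uniform(0.0, 1.0, size=200)
X = a[:, None] * np.sin(t)[None, :]
y = a                                          # noiseless responses
x0 = 0.5 * np.sin(t)                           # query curve with r(x0) = 0.5
estimate = uniform_kernel_regression(X, y, x0, h=0.5)
```

Theorem 5.1 only guarantees the existence of a suitable bandwidth sequence tending to zero; in practice the radius must shrink slowly enough that the ball around the query keeps filling with data.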

Proposition 5.1: Under assumptions H1–H3 with a separable metric space, if is the uniform kernel, then the Besicovitch condition (1) is necessary in order to attain consistency at almost every point, i.e., for almost all .

VI. EXAMPLES AND COUNTEREXAMPLES

-strong regularity: Let be a metric space and a Borel probability measure on it. As mentioned before, the space is -strongly regular if and only if the probability measure is tight; hence, if is not tight, it follows easily that the space is not -strongly regular. An example of a measure which is not tight is given in [6, Example 7.1.6] (see also [5, Exercise 1.13]): let be a nonmeasurable subset of the interval with zero inner measure and unit outer measure, considered with the usual metric as a metric space. Then, every Borel subset of this space has the form , where is a Borel subset of . We define a measure on it by , where is the Lebesgue measure (i.e., the restriction of the Lebesgue measure). Since the Lebesgue measure is regular, this measure is regular as well (we recall that the closed sets of the subspace are the intersections of the subspace with closed subsets of ). But it is not tight, since every compact set in the subspace is also compact in and hence, by construction, has Lebesgue measure zero; we thus obtain .

Examples where the Besicovitch condition does not hold: 1) Preiss [21] provides an example of a Gaussian measure defined on the Hilbert space and a bounded measurable function such that

2) Preiss [20] exhibits a Gaussian measure defined in and a bounded measurable function such that, for almost every the limit

does not exist. 3) Preiss [20] (see also [9]) provides an example of a Gaussian measure defined on a Hilbert space such that there exists a measurable set such that

Therefore, if we take

, we have that

while is -strongly regular.

Examples where the Besicovitch condition holds: 1) In every locally compact metric space, the Besicovitch condition holds for every integrable function and any Borel probability measure. However, if is a locally compact


Banach space, Riesz showed that it is a finite-dimensional space (see, for instance, [10]). 2) An example of a nondegenerate Gaussian measure in such that

for almost every , for every , is given in Cérou and Guyader [9], due to Preiss and Tiser [22]. More precisely, for a given nonnegative sequence such that , consider the space

and the measure given by the countable product of standard normal distributions on . They show that if there exists such that , then the Besicovitch condition holds for all .

For this choice of , we write

(7)

where, in the first term, we have used (6) and , and in the third term we have used condition (iii’). Since is uniformly continuous, for each there exists such that if , then for all . For that , using the boundedness of , we get

APPENDIX A
PROOF OF THE MAIN RESULTS

We will use the short notation to denote the distance from to the th sample element.

Proof of Theorem 3.2: This proof follows the general lines of Stone's theorem given for the finite-dimensional case (see [13]). Let us consider the auxiliary function . Since

(8) for , due to (i). Replacing in (7), we get that this part converges. In order to deal with the remaining part, let us observe that, by conditional independence,

it will be sufficient to prove that Since

and

converges to zero.

which implies

and by Jensen’s inequality

Let and with given in condition (iii’). Since is -strongly regular for all , there exists a uniformly continuous and bounded function such that (6)

for , where in the last inequality, we use condition (ii) and the result follows taking .


Proof of Corollary 3.1: We replace by in the boundedness step of the proof of Theorem 3.2.

Proof of Theorem 3.4: In order to get the result, we only need to redo the bound (8) in the proof of Theorem 3.2 with a continuous and bounded function. For the sequence given in (i'), we write

In addition, to prove Theorem 4.1, we will also need the following technical results, whose proofs are given in Appendix B.

Lemma A.2: For each , there is a sequence such that and .

Lemma A.3: Let be a metric space, be fixed, and consider the weights given by (4). Then, for every nonnegative bounded measurable function

Therefore, it will be sufficient to prove that both terms converge to zero. By the boundedness of , we have that

which converges to zero by (i’). Now, let us consider . Since , is equal to

Lemma A.4: Let be a separable metric space, a Borel probability measure, and a random sample of . If and , then a.s.; moreover, a.s. whenever .

Proof of Theorem 4.1: We will prove that the weights given by (4) satisfy conditions (i'), (ii), and (iii") of Theorem 3.4. To prove (i'), let us consider fulfilling the conclusion of Lemma A.2 and let . Then

(9) Let be fixed. Since is continuous, for all , there exists such that if then . For that , there exists such that if then, . Therefore, for a given , there exists such that if , . In addition, by the boundedness of we have that then, the dominated convergence theorem implies that

Proof of Corollary 3.2: We replace by in the proof of Theorem 3.4.

In what follows, and will denote the distance from to its k-nearest neighbor and the closed ball of center and radius , respectively. To prove Theorems 4.1 and 5.1, we will need the following lemma, whose proof is given in Appendix B.

Lemma A.1: Let be a separable metric space. If , then

(10)

Corollary A.1: Let be a separable metric space. Then, for almost every , for all , .

Let be fixed. Then, it will be sufficient to prove that . For that, let us observe that if and only if ; then, it will be equivalent to show that . Since , for large enough we have that , from which it follows that .


Therefore, from Hoeffding's inequality and the fact that , we get

Finally, since the right-hand side of this inequality does not depend on , we can apply the dominated convergence theorem and conclude that


Therefore, the proof will be completed if we show that the expectation with respect to of these three functions converges to zero. For this, let , and . Since is continuous, there exists such that if , then . On the other hand, since, by Lemma A.4, , for that there exists such that if , then . Then, converges to 0 for almost all , and it is also bounded; therefore, by the dominated convergence theorem, it follows that . In addition, the boundedness of implies the boundedness of ; then, using the dominated convergence theorem again, we have that

On the other hand, since , condition (ii) is trivial. It remains to prove that condition (iii") holds. Given that is a Borel measure, for all , there exists a continuous and bounded function such that . From Lemma A.1, we have

Finally, since is bounded,

which converges to zero if the bounded random variables converge to zero in probability. To see this, fix . For every fixed, using the conditioning trick again, we get

(11)

The Besicovitch condition (1) implies that the second term converges to zero, while the convergence to zero of the first term follows from Lemma A.4.

Applying Lemma A.3 with in (11), we get

We will use the short notation to denote the ball of center and radius . To prove Theorem 5.1, we will need the following technical lemmas, which are proved in Appendix B.

Lemma A.5: For each , there exists a sequence of positive real numbers such that .

Lemma A.6: For each , if , then

Lemma A.7: Let be a metric space, be fixed, and consider the weights given by (13). Then, for every nonnegative measurable function , we have (12)


Proof of Theorem 5.1: We take fulfilling the conclusion of Lemma A.5 and denote by supp the support. For and for the sequence , let be the weights given by (5), and let be the weights defined by

if if

(13)

Since is regular, there are constants such that, for each

which entails that condition (i') is fulfilled. In order to prove that the weights satisfy condition (ii), we write

For fixed, we define

if if

Since

From this inequality, it follows that and in this case

, we have that

if and only if

Therefore, it will be sufficient to prove that

which entails

To do this, we write (15)

Therefore, by Corollary 3.3, it suffices to prove that the weights satisfy conditions (i’), (ii), and (iii”) of Theorem 3.4. To prove (i’), let us take . Then

Let us observe that the last expression is well defined since implies . By Lemma A.6, for all , there exists such that if

(14)

On the other hand, by Lemma A.5, there exists such that if

Given , let be fixed. Since , there exists such that if , for all , and consequently

In addition,

from which it follows that

Then, in (15), for all , there exists such that if

for some positive constant , from which it follows that

In addition, since , we can apply the dominated convergence theorem and conclude that . Therefore, applying the dominated convergence theorem in (14), we have that

which entails (ii). It remains to verify that condition (iii”) holds. Since and is a Borel measure, for all ,


there exists a continuous and bounded function such that . By Lemma A.1,


Therefore, the assumption implies that the Besicovitch condition holds a.s., and then in probability. (16)

Let be fixed. Then, applying Lemma A.7 with in (12) and breaking into three parts, we can use the same arguments as in part (iii") of the proof of Theorem 4.1. Finally, since the Besicovitch condition holds, taking expectation yields the desired result.

Proof of Proposition 5.1: Let and be as given in Lemma A.5, and let be the uniform kernel. Let us consider the uniform kernel type estimator given by

for which we will suppose that . We write

APPENDIX B
PROOF OF SOME TECHNICAL LEMMAS

Proof of Lemma 3.1: (⇒) Since is a Borel probability measure, by [5, Th. 1.1, p. 7], there exists a closed set such that , for arbitrary . On the other hand, since , there exist a compact set and a continuous function of support such that , from which it follows that . Finally, we have that , and therefore is tight. (⇐) Since is a Borel probability measure, given a Borel set , by [5, Th. 1.1, p. 7], there exists a closed set such that and , for arbitrary . On the other hand, if is also tight, there exists a compact set with . Therefore, the compact set fulfills , which implies that is regular with respect to the family of compact sets (a Radon measure), and therefore is -strongly regular.

Proof of Lemma A.1: Let . By definition of support, we have that

Since is a separable metric space, it contains a countable dense subset . That is, for each , there is with . Let be the set of such 's and, for each , we define

and

By Lemma A.6

Therefore from what follows that, for all such that if

Then, for all

, we get

, there exists

The right side of this inclusion is a union of countably many sets, so it remains to prove that they have zero measure. For that, let us observe that, for all , there is such that, since , . This implies that , which has zero measure by the definition of , from which the result follows.

Proof of Lemma A.2: Let be fixed and . Without loss of generality, we assume that for all . If is an atom, we have for and for all ; therefore, the sequence satisfies the property. If is not an atom, . For fixed, we define . This is well defined: in fact, first, for all , the set is not empty since it contains .

6706

IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 58, NO. 11, NOVEMBER 2012

Second, suppose that there exists such that, for all , . Then, since for such , we get , which is a contradiction. We will now prove that when . Suppose this is not true; then there exists such that for infinitely many values of . Since is the maximum, we have that infinitely often, from which it follows that , which is a contradiction since . Therefore, the sequence satisfies the requirements.

Proof of Lemma A.3: Conditional on , we can consider as an ordered sample from the distribution of restricted to . Therefore, if are i.i.d. elements from this truncated distribution, we can write

In particular, for

, it follows that

On the other hand, from (20), we have that

from which (19) follows. Let us now consider not fixed. This case follows from the previous one as well: since the metric space is separable, by Lemma A.1, we have

From the first case, , and, in addition, .

Proof of Lemma A.4: Let be fixed and , which is positive by the definition of support. We will show that a.s., which is equivalent to showing that, for all

. Therefore, by the dominated convergence theorem, we have that

(17)

when . Let us observe that if and only if

(18)

Proof of Lemma A.5: Let us take such that . If we take in Lemma A.2 the sequence , for , we can find such that , from which the lemma follows.

Then, (17) is equivalent to prove that

(19)

when . By hypothesis, the right-hand side of (18) goes to 0; therefore, since , there exists such that if

(20)

On the other hand, by the strong law of large numbers, the left-hand side of (18) converges with probability one to

Therefore, for all , when , we have

Proof of Lemma A.6: Let be fixed such that , and let be a sequence such that . For arbitrary, we write

(21)

Defining the i.i.d. Bernoulli random variables with parameter and using Bernstein's inequality in (21), we have that

(22)


Therefore, from (21) and (22), we get


then

(23)

which converges to zero by Lemma A.5.

Proof of Lemma A.7: For fixed, let us consider the weights given in (13) and let us define and . Since are i.i.d. and depends only on , we can write

(24)

where, in the last equality, we have used that depends on , which are independent of . Now, by Chebyshev's inequality, we have

(25)

Since and are i.i.d., and , it follows that

Finally, with this inequality in (24), we get

ACKNOWLEDGMENT

We would like to thank the referees for their valuable comments and insightful suggestions. In particular, we really appreciate the careful attention of one of them, who caught some mistakes in the first version of this manuscript. We would also like to thank Ricardo Toledano and Roberto Scotto for helpful discussions.

REFERENCES

[1] C. Abraham, G. Biau, and B. Cadre, “On the kernel rule for function classification,” Ann. Inst. Statist. Math., vol. 58, no. 3, pp. 619–633, 2006.
[2] A. Baíllo and A. Grané, “Local linear regression for functional predictor and scalar response,” J. Multivariate Anal., vol. 100, no. 1, pp. 102–111, 2009.
[3] G. Biau, F. Bunea, and M. H. Wegkamp, “Functional classification in Hilbert spaces,” IEEE Trans. Inf. Theory, vol. 51, no. 6, pp. 2163–2172, Jun. 2005.
[4] G. Biau, F. Cérou, and A. Guyader, “Rates of convergence of the functional k-nearest neighbor estimate,” IEEE Trans. Inf. Theory, vol. 56, no. 4, pp. 2034–2040, Apr. 2010.
[5] P. Billingsley, Convergence of Probability Measures, ser. Wiley Series in Probability and Statistics, 2nd ed. New York: Wiley, 1999.
[6] V. Bogachev, Measure Theory, Volume II. New York: Springer, 2007.
[7] H. Cardot, F. Ferraty, and P. Sarda, “Spline estimators for the functional linear model,” Statist. Sinica, vol. 13, no. 3, pp. 571–591, 2003.
[8] H. Cardot, A. Goia, and P. Sarda, “Testing for no effect in functional linear regression models, some computational approaches,” Commun. Statist. Simulation Comput., vol. 33, no. 1, pp. 179–199, 2004.
[9] F. Cérou and A. Guyader, “Nearest neighbor classification in infinite dimension,” ESAIM Probab. Statist., vol. 10, pp. 340–355, 2006.
[10] M. Cotlar and R. Cignoli, An Introduction to Functional Analysis. Amsterdam, The Netherlands: North Holland, 1974.
[11] A. Cuevas, M. Febrero, and R. Fraiman, “Linear functional regression: The case of fixed design and functional response,” Can. J. Statist., vol. 30, no. 2, pp. 285–300, 2002.
[12] L. Devroye, “On the almost everywhere convergence of nonparametric regression function estimates,” Ann. Statist., vol. 9, no. 6, pp. 1310–1319, 1981.
[13] L. Devroye, L. Gyorfi, and G. Lugosi, A Probabilistic Theory of Pattern Recognition. New York: Springer-Verlag, 1996.
[14] F. Ferraty and P. Vieu, Nonparametric Functional Data Analysis: Theory and Practice. New York: Springer, 2006.
[15] P. Hall and J. L. Horowitz, “Methodology and convergence rates for functional linear regression,” Ann. Statist., vol. 35, no. 1, pp. 70–91, 2007.


[16] P. Mattila, “Differentiation of measures on uniform spaces,” in Measure Theory Oberwolfach 1979, ser. Lecture Notes in Mathematics. Berlin, Germany: Springer, 1980, vol. 794, pp. 261–283.
[17] C. Preda and G. Saporta, “Clusterwise PLS regression on a stochastic process,” Comput. Statist. Data Anal., vol. 49, no. 1, pp. 99–108, 2005.
[18] C. Preda and G. Saporta, “PLS regression on a stochastic process,” Comput. Statist. Data Anal., vol. 48, pp. 149–158, 2005.
[19] C. Preda, G. Saporta, and C. Lévéder, “PLS classification of functional data,” Comput. Statist., vol. 22, no. 2, pp. 223–235, 2007.
[20] D. Preiss, “Gaussian measures and the density theorem,” Comment. Math. Univ. Carolin., vol. 22, no. 1, pp. 181–193, 1981.
[21] D. Preiss, “Differentiation of measures in infinitely-dimensional spaces,” in Proc. Conf. Topology Meas. III, Part 1, 2 (Vitte/Hiddensee, 1980), 1982, pp. 201–207.
[22] D. Preiss and J. Tiser, “Differentiation of measures on Hilbert spaces,” in Measure Theory Oberwolfach 1981, ser. Lecture Notes in Mathematics. Berlin, Germany: Springer, 1982, vol. 945, pp. 194–207.
[23] W. Rudin, Real and Complex Analysis. New York: McGraw-Hill, 1987.
[24] J. O. Ramsay and B. W. Silverman, Functional Data Analysis, ser. Springer Series in Statistics. New York: Springer, 1997.
[25] B. P. Rynne and M. A. Youngson, Linear Functional Analysis. New York: Springer, 2008.

[26] C. J. Stone, “Consistent nonparametric regression,” Ann. Statist., vol. 5, no. 4, pp. 595–645, 1977.
[27] R. Toledano, “A note on the Lebesgue differentiation theorem in spaces of homogeneous type,” Real Anal. Exchange, vol. 29, no. 1, pp. 335–340, 2003.
[28] R. Wheeden and A. Zygmund, Measure and Integral: An Introduction to Real Analysis. Boca Raton, FL: CRC, 1977.

Liliana Forzani, biography not available at the time of publication.

Ricardo Fraiman, biography not available at the time of publication.

Pamela Llop was born in Santa Fe, Argentina, on December 11, 1982. She received the Ph.D. degree from the Facultad de Ingeniería Química, Universidad Nacional del Litoral, Santa Fe, in 2011. She is currently an Assistant Professor at the Facultad de Ingeniería Química, Universidad Nacional del Litoral, Santa Fe, Argentina. She is interested in functional data and nonparametric statistics.
