IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 49, NO. 12, DECEMBER 2003
Postfiltering Versus Prefiltering for Signal Recovery From Noisy Samples

Mirosław Pawlak, Member, IEEE, Ewaryst Rafajłowicz, Member, IEEE, and Adam Krzyżak, Senior Member, IEEE
Abstract—We consider the extension of the Whittaker–Shannon (WS) reconstruction formula to the case of signals which are sampled in the presence of noise and which are not necessarily band limited. Observing that in this situation the classical sampling expansion yields an inconsistent reconstruction, we introduce a class of signal recovery methods with a smooth correction of the interpolation series. Two alternative data smoothing methods are examined, based either on global postfiltering or on local data presmoothing. We assess the accuracy of the methods by the global L₂ error. Both band-limited and non-band-limited signals are considered. A general class of correlated noise processes is taken into account. The weak and strong rates of convergence of the algorithms are established and their relative efficiency is discussed. The influence of the noise memory and its moment structure on the accuracy is thoroughly examined.

Index Terms—Band-limited signals, non-band-limited signals, postfiltering, prefiltering, rate of convergence, signal recovery, strong convergence, Whittaker–Shannon (WS) sampling theorem.
I. INTRODUCTION
THE Whittaker–Shannon (WS) interpolation series plays a fundamental role in representing signals/images in the discrete domain. In fact, it is generally recognized as a milestone in signal processing, communication systems, as well as Fourier analysis [6], [11], [15], [16], [29], [31], [33], [35]. The result may be briefly stated as follows. Consider a class of signals f(t) which are represented in the following way:

  f(t) = (1/2π) ∫_{−Ω}^{Ω} F(ω) e^{iωt} dω     (1.1)

where Ω is a finite number called the bandwidth and F(ω) is the Fourier transform of f(t). A class of functions for which (1.1) holds is often referred to as the class of band-limited functions, which in the sequel we shall denote as BL(Ω). The WS theorem says that every f ∈ BL(Ω) can be reconstructed from its discrete values f(kτ), k = 0, ±1, ±2, …, by

  f(t) = Σ_{k=−∞}^{∞} f(kτ) sinc((t − kτ)/τ)     (1.2)

Manuscript received September 3, 2001; revised March 1, 2003. This paper is dedicated to the memory of Sid Yakowitz. The material in this paper was presented in part at the IEEE International Symposium on Information Theory, Lausanne, Switzerland, June/July 2002.
M. Pawlak is with the Department of Electrical and Computer Engineering, University of Manitoba, Winnipeg, MB R3T 5V6, Canada (e-mail: [email protected]).
E. Rafajłowicz is with the Institute of Engineering Cybernetics, Wrocław University of Technology, Wrocław, Poland (e-mail: [email protected]).
A. Krzyżak is with the Department of Computer Science, Concordia University, Montreal, QC H3G 1M8, Canada (e-mail: [email protected]).
Communicated by G. Lugosi, Associate Editor for Nonparametric Estimation, Classification, and Neural Networks.
Digital Object Identifier 10.1109/TIT.2003.820013
provided that τ ≤ π/Ω, where sinc(t) = sin(πt)/(πt) for t ≠ 0 and sinc(0) = 1. The convergence in (1.2) is uniform on any compact interval of the real line. Formula (1.2) is also frequently referred to as the cardinal series or the WS sampling/reconstruction scheme.

A number of properties and extensions of (1.2) have been given in the literature. In particular, truncation, aliasing, location (jitter), and amplitude errors of the WS expansion have been examined. Furthermore, generalizations to multiple dimensions, random signals, not necessarily band-limited signals, missing data, wavelet subspaces, and irregular sampling have been proposed. We refer to [6], [11], [15], [16], [29], [31], [34], [35] for an extensive overview of the theory and applications of (1.2) and its extensions.

Relatively little attention, however, has been given to statistical aspects of the sampling theorem, i.e., to the statistical analysis of (1.2) when only a finite record of noisy data is available. This issue has been mentioned a number of times in the signal processing literature, see [15], but no algorithms with established convergence properties for signal reconstruction from sampled and noisy data were given. The first rigorous theoretical treatment of this problem appeared in [21] and next in [22]. In [13], [20], [23]–[25], other reconstruction algorithms have been proposed and their convergence results, along with convergence rates, have been established. In all these papers, a particular class of reconstruction algorithms has been examined and mostly band-limited signals have been taken into account.

In this paper, we study the previously introduced algorithms for both band-limited and non-band-limited signals. Some further refinements of the previous methods are given. Furthermore, we assume a general class of correlated noise processes, whereas in the previous papers only white noise has been considered. Hence, our principal goal in this paper is to examine reconstruction schemes resembling (1.2) when a signal f(t) is sampled in the presence of additive measurement noise, i.e., we observe

  y_k = f(kτ) + ε_k,  k = 0, ±1, ±2, …     (1.3)

where τ > 0 is the sampling rate. Regarding the noise process {ε_k} we assume very general conditions, i.e., that for each k, ε_k is a weakly stationary linear process of the following form:

  ε_k = Σ_{j=0}^{∞} c_j ξ_{k−j}.     (1.4)
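As a quick numerical illustration of the cardinal series (1.2) (this sketch is ours, not part of the original paper), the following Python snippet evaluates a truncated WS expansion for the band-limited test signal sinc(t) ∈ BL(π); the truncation range and evaluation grid are arbitrary choices:

```python
import numpy as np

def cardinal_series(t, k_idx, samples, tau):
    # Truncated WS expansion (1.2): sum_k f(k*tau) * sinc((t - k*tau)/tau),
    # where np.sinc(x) = sin(pi*x)/(pi*x) matches the paper's convention.
    return np.sinc((t[:, None] - k_idx[None, :] * tau) / tau) @ samples

tau = 1.0                            # tau <= pi/Omega; here Omega = pi
k = np.arange(-200, 201)             # finite record, |k| <= n with n = 200
t = np.linspace(-5.0, 5.0, 1001)
f = np.sinc                          # band-limited test signal in BL(pi)
err = np.abs(cardinal_series(t, k, f(k * tau), tau) - f(t)).max()
print(f"max reconstruction error on [-5, 5]: {err:.2e}")
```

With noise-free samples the truncation error on a compact interval is small; the remainder of the paper concerns what happens when the samples are replaced by the noisy observations (1.3).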
In (1.4), {ξ_j} is a sequence of independent and identically distributed (i.i.d.) random variables with zero mean and finite variance σ². We also assume that

  Σ_{j=0}^{∞} |c_j| < ∞.     (1.5)

Hence, we admit a general class of correlated noise processes being an output of a linear, stable, and causal filter with an infinite memory and with the impulse response sequence {c_j}.

A naive reconstruction algorithm would replace f(kτ) in (1.2) by the noisy observations {y_k}, yielding

  f*(t) = Σ_{|k| ≤ n} y_k sinc((t − kτ)/τ).     (1.6)

This method cannot converge to f(t) as n → ∞ since it interpolates noise. The interpolation of noise is clearly an undesirable property for the signal recovery problem from noisy data. In fact, we show in Section II that the integrated variance of the estimate f*(t) is unbounded. To fix this deficiency, we introduce two complementary methods of smoothing of the interpolation series. In the first method (denoted by f̂), we apply the cardinal expansion (1.2) to the plain data (1.3), followed by low-pass filtering. Hence, this method is just a filtered version of f*(t). This is a global smoothing method, as the filtering process takes place for all values of t. In the second strategy (denoted by f̃), we first process the data by a local smoothing method and then apply the interpolation scheme (1.2). This is a local method, as the smoothing is confined to data falling into a finite window. One can say that the proposed two methods of smoothing are in an inverse relationship to each other: we smooth either before (method two) or after (method one) the interpolation. Fig. 1 illustrates these two schemes of signal reconstruction based on smooth corrections of the cardinal expansion.

Fig. 1. (a) Reconstruction method f̂ using the interpolation series followed by low-pass filtering. (b) Reconstruction method f̃ based on local data smoothing followed by the interpolation expansion.

Clearly, there are other possible sampling-based methods for signal reconstruction [11], [15], [16], [32], [31], [35]. In particular, let us mention a recently introduced general class of interpolation/approximation schemes [32], [2], [31]. There, the accuracy of the proposed methods has been established assuming that one can have an infinite noise-free record of samples {f(kτ) : k = 0, ±1, ±2, …}. In this paper, we examine a more realistic situation when one has a finite record of sampled data which are, moreover, contaminated by noise. Furthermore, we argue (see Section II) that the methods proposed in [32], [2], [31] are not able to converge to a true signal in the case of noisy data.

One of the principal goals of this paper is to compare the proposed smoothing strategies. We assess their performance by the global integrated squared error (ISE), defined as follows:

  ISE(f̂) = ∫_{−∞}^{∞} (f̂(t) − f(t))² dt.     (1.7)
ISE(f̃) is defined analogously. The mean values of ISE(f̂) and ISE(f̃) will be denoted by MISE(f̂) and MISE(f̃). The accuracy of the reconstruction methods is evaluated both in the MISE and the ISE sense. We apply our algorithms not only to the case of band-limited signals but also to a large class of non-band-limited signals defined on (−∞, ∞). Hence, we show that our algorithms can adapt to a larger class of signals than band limited. This is carried out by approximating a non-band-limited signal in L₂(ℝ) by a sequence of band-limited functions with the bandwidth increasing to infinity, i.e., Ω = Ω_n → ∞ as n → ∞. It is worth noting that, allowing Ω to vary, our construction can be viewed from the perspective of wavelet interpolation subspaces (multiresolution theory); see, e.g., [9], [34], and references cited therein. We conclude that, depending on the class of signals considered, a different smoothing method can be more accurate. Thus, no single method uniformly outperforms the other over a large class of signals, and a thorough comparison of the methods is given.

The paper is organized as follows. First, a formal definition of our reconstruction algorithms is given (Section II). The accuracy of the methods in the case of band-limited signals is studied in Section III. Section IV, in turn, extends our theory to non-band-limited signals. In Sections III and IV, we examine the weak convergence, i.e., the errors MISE(f̂) and MISE(f̃) are evaluated. Section V gives corresponding results for ISE(f̂) and ISE(f̃), i.e., the rates of convergence with probability one are given. We observe that the strong convergence has a substantially different behavior than the weak one. Section VI presents some simulation results, while Section VII discusses further extensions. All proofs are deferred to the Appendix.

II. SIGNAL RECOVERY ALGORITHMS

Throughout the paper, we consider signals with finite energy, i.e., f ∈ L₂(ℝ). Our estimators originate from the interpolation formula (1.2), which for our future needs can be written as the orthogonal series expansion

  f(t) = Σ_{k=−∞}^{∞} f(kτ) φ_k(t)     (2.1)

for f ∈ BL(Ω), where φ_k(t) = sinc((t − kτ)/τ). It is well known [11], [29] that {φ_k(t) : k = 0, ±1, ±2, …} defines an orthogonal and complete system in the space BL(π/τ), and the expansion (2.1) holds provided that τ ≤ π/Ω. Moreover, we have

  ∫_{−∞}^{∞} φ_k(t) φ_l(t) dt = τ if k = l, and 0 if k ≠ l.     (2.2)
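The orthogonality relation (2.2) is easy to confirm numerically; the snippet below (our illustration, with arbitrary grid choices) approximates the integral by the trapezoid rule on a wide grid:

```python
import numpy as np

# Numerical check of the orthogonality relation (2.2) for
# phi_k(t) = sinc((t - k*tau)/tau); the integral over the real line is
# approximated on a wide, finely spaced grid.
tau = 0.5
t = np.arange(-400.0, 400.0, 0.01)
phi = lambda k: np.sinc((t - k * tau) / tau)

for k, l in [(0, 0), (0, 1), (2, 5)]:
    print(k, l, round(np.trapz(phi(k) * phi(l), t), 3))
# prints approximately tau = 0.5 for k == l and 0 for k != l
```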
As we have already noted, the naive estimate f*(t) defined in (1.6) is obtained by replacing f(kτ) in (2.1) by the random samples {y_k}. This is, however, an inconsistent estimate of f(t) since it retains the noise. Indeed, by (2.2) and Parseval's formula we have

  ISE(f*) = ∫ (Σ_{|k| ≤ n} ε_k φ_k(t))² dt + τ Σ_{|k| > n} f²(kτ).

The second term represents the approximation error resulting from the use of a finite number of data points. This term tends to zero as n → ∞. Regarding the first term, let us observe that by the orthogonality relation (2.2) it is equal to

  τ Σ_{|k| ≤ n} ε_k².     (2.3)

Let us assume, for the time being, that the noise process is white, i.e., that in (1.4) we have c₀ = 1 and c_j = 0 otherwise. Then the mean value of (2.3) is equal to τ σ² (2n + 1), which tends to infinity as n → ∞. Hence, we have shown that

  MISE(f*) → ∞  as  n → ∞.

Note that this result holds regardless of whether τ is constant or whether τ → 0 in such a way that nτ → ∞. This lack of consistency of the naive estimate calls for some kind of smoothing correction. In the method illustrated in Fig. 1(a), this is achieved by filtering out all frequencies greater than Ω in f*(t). Hence, let h(t) be the impulse response function of the filter. Then the estimate f̂(t) is defined as follows:

  f̂(t) = (h * f*)(t)     (2.4)

or, in a more explicit form,

  f̂(t) = Σ_{|k| ≤ n} y_k (h * φ_k)(t)     (2.5)

where * denotes the convolution operator. A particular and important case of (2.5) corresponds to h being the ideal low-pass filter with the bandwidth Ω, i.e., let

  H(ω) = 1 for |ω| ≤ Ω, and H(ω) = 0 otherwise.     (2.6)

Assuming that τ ≤ π/Ω, we obtain

  (h * φ_k)(t) = (τΩ/π) sinc(Ω(t − kτ)/π).

This yields our first reconstruction formula for f(t):

  f̂(t) = (τΩ/π) Σ_{|k| ≤ n} y_k sinc(Ω(t − kτ)/π).     (2.7)

It is worth noting that in (2.7) one could replace Ω by an upper bound Ω′, Ω ≤ Ω′ ≤ π/τ, i.e., only knowledge of an upper bound for the bandwidth would be required. Let us also note that f̂ ∈ BL(Ω), i.e., the estimate lives in the same space as f. In Section III, we will show that f̂(t) converges to f(t) for all t, i.e., we evaluate the speed with which MISE(f̂) → 0 as n → ∞. Our asymptotic results require that τ → 0 and nτ → ∞ as n → ∞. Section IV extends the above results to the case of non-band-limited signals. There, we need to modify the estimate by assuming that Ω = Ω_n → ∞ as n → ∞ at a certain controlled rate. Since BL(Ω₁) ⊂ BL(Ω₂) for Ω₁ ≤ Ω₂, the increase of Ω produces a ladder of subspaces of L₂(ℝ) in which the estimates reside. In the limit, we are able to reach the L₂(ℝ) space. This is a rough idea of the multiresolution approach, which is an important part of the modern wavelet theory [9], [34].

The estimate f̂ is a kernel-type estimate since it can be written as

  f̂(t) = τ Σ_{|k| ≤ n} y_k K(t, kτ)     (2.8)

where K(t, s) = sin(Ω(t − s))/(π(t − s)) is the reproducing kernel for BL(Ω), i.e., ∫ K(t, s) g(s) ds = g(t) for all g ∈ BL(Ω). In [22], a general class of kernel estimates has been studied in which the kernel in (2.8) is replaced by a kernel generated by a sampling window (2.9); a typical choice of the window is a power of the sinc function. This class of kernel estimates can improve the pointwise rate of convergence [22] but, as can be seen from the derivations given in this paper, it does not yield a faster rate of convergence of the global error.
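A minimal sketch of the postfiltered estimate, under our reading of (2.7); the test signal, noise level, and parameter values are illustrative assumptions rather than the paper's experimental settings:

```python
import numpy as np

def f_hat(t, y, k_idx, tau, omega):
    # Postfiltered estimate, our reading of (2.7): passing the cardinal
    # series through an ideal low-pass filter of cutoff omega replaces each
    # sinc((t - k*tau)/tau) by (tau*omega/pi) * sinc(omega*(t - k*tau)/pi),
    # valid when tau <= pi/omega.
    u = omega * (t[:, None] - k_idx[None, :] * tau) / np.pi
    return (tau * omega / np.pi) * (np.sinc(u) @ y)

rng = np.random.default_rng(0)
tau, omega, n = 0.5, np.pi, 400              # tau <= pi/omega holds
k = np.arange(-n, n + 1)
f = np.sinc                                  # test signal in BL(pi), our choice
y = f(k * tau) + 0.1 * rng.standard_normal(k.size)
t = np.linspace(-5.0, 5.0, 501)
print(np.mean((f_hat(t, y, k, tau, omega) - f(t)) ** 2))
```

In practice, as discussed at the end of this section, the low-pass step can instead be realized with an FFT in O(n log n) operations; the direct evaluation above is simply the most transparent form.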
To construct the estimate f̃ depicted in Fig. 1(b), we first smooth the input raw data {y_k} by applying a local smoothing rule defined over a certain window around the given data point. In this paper, we consider a linear smoothing method of the following form:

  ȳ_k = Σ_{|j| ≤ M} w_j y_{k+j}     (2.10)

for weights {w_j : |j| ≤ M} such that Σ_{|j| ≤ M} w_j = 1. Hence, the smoothed signal ȳ_k is the weighted moving average of the data in the neighborhood of kτ, and {w_j} is often called the moving average sequence. The weights are assumed to be symmetric, i.e., w_j = w_{−j}, and normalized, i.e., Σ_{|j| ≤ M} w_j = 1. The window size M is a critical parameter controlling the local tradeoff between the amount of noise (quantified by the variance term) and the approximation error expressed by the bias term. In fact, a large M results in small variance (a large amount of smoothing) but a poor fit to the signal shape. On the other hand, a small M gives reduced systematic error (bias) but yields larger variance. Thus, an optimal value of M exists, yielding a compromise between random error and systematic error. The analysis given in Sections III and IV provides criteria for selecting the optimal M. We also determine the optimal weight sequence {w_j} in some particular cases. The simplest example of (2.10) is the classical moving average smoothing method corresponding to the uniform weights

  w_j = 1/(2M + 1),  |j| ≤ M.     (2.11)

Examples of other smoothing rules will be given in Section III. It is also worth noting that the linear smoothing in (2.10) could be replaced by some nonlinear filters [26], yielding a potential improvement of the method.

To get further insight into the role played by the weights, let us assume for the time being that the noise process is uncorrelated. Then we can observe that ȳ_k has variance

  var(ȳ_k) = σ² Σ_{|j| ≤ M} w_j².     (2.12)

The weights {w_j} should be chosen so that Σ_{|j| ≤ M} w_j² is considerably smaller than one. It should also be noted that the sequence {ȳ_k} is correlated, with a covariance between ȳ_k and ȳ_l given by the overlap of the weight windows for |k − l| ≤ 2M, and zero otherwise. The smoothing rule in (2.10) combined with the interpolation series (1.2) yields the following reconstruction formula for f(t):

  f̃(t) = Σ_{|k| ≤ n} ȳ_k sinc((t − kτ)/τ).     (2.13)
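A corresponding sketch of the presmoothing estimate, assuming the uniform weights (2.11) and our reading of (2.10) and (2.13); boundary effects of the finite convolution are ignored here:

```python
import numpy as np

def f_tilde(t, y, k_idx, tau, M):
    # Presmoothing estimate, our reading of (2.10)-(2.13): a uniform moving
    # average (2.11) of half-width M smooths the raw samples, and the smoothed
    # values are fed into the cardinal interpolation series (1.2).
    w = np.ones(2 * M + 1) / (2 * M + 1)      # uniform weights (2.11)
    y_bar = np.convolve(y, w, mode="same")    # local smoothing (2.10)
    return np.sinc((t[:, None] - k_idx[None, :] * tau) / tau) @ y_bar
```

Note that mode="same" implicitly zero-pads the record, so the first and last M smoothed values are biased; this edge effect does not affect the asymptotic analysis below.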
It should be observed that f̃ is an orthogonal series estimate, i.e., it is in the form of the interpolation series. Note that f̃ requires specifying the parameters τ, n, and M, and also the weight sequence {w_j}. In Section III, we evaluate MISE(f̃) in the case of band-limited signals, whereas Section IV extends these results to a class of non-band-limited signals. We impose some conditions on the smoothing window size M as well as the sampling period τ. Assuming that the weight sequence {w_j} meets some reproducing properties, i.e., it has a certain number of vanishing moments, we are able to calculate the precise rate of convergence.

In [32] (see also [2], [31]), a powerful class of signal reconstruction methods from sampled, noise-free data has been proposed. These methods generalize a number of previously introduced schemes, including interpolation, generalized sampling, and projection methods. In particular, the concept of prefiltering an analog signal prior to sampling has been utilized. Hence, one observes the discrete samples of the filtered version of f(t), i.e., samples of the convolution of f with a prefilter. This is followed by a convolution interpolation method, yielding the reconstruction formula (2.14), where s(t) is the impulse response of the reconstruction filter [31], [32]. Note that if the prefilter is absent, then (2.14) is a standard representation of f confined to discrete points, and if, moreover, we use s(t) = sinc(t/τ), then (2.14) is again the expansion (1.2). Note also that in (2.14) we need not assume that f is band limited. In [32], assuming noise-free data in (2.14), the approximation properties of (2.14) for a large class of signals and various choices of the prefilter and the reconstruction filter s have been examined. Let us observe that the estimation method in (2.13) has a form similar to (2.14), with the analog prefiltering replaced by the local discrete convolution between the weight sequence {w_j} and the noisy data {y_k}. Our methods, however, as proved in this paper, can converge to the true signal, whereas the reconstruction algorithm in (2.14) does not seem to have this property for signals observed in the presence of noise. In fact, the prefiltering strategy can combat the aliasing effect, but it does not have a noise-diminishing property. A detailed analysis of scheme (2.14) in the presence of noise is left for future studies.

Let us finally comment on the efficient implementation of the estimates f̂ and f̃. Regarding the estimate f̂ in (2.7), observe that it employs two operations, the interpolation and low-pass filtering, as explained by (2.4). The signal f*(t) has the interpolation property, i.e., f*(kτ) = y_k, and therefore can be represented by the samples {y_k}. The low-pass filter in (2.6) can be easily realized by calculating the fast Fourier transform (FFT) of the data {y_k} and then setting to zero all the coefficients corresponding to the frequencies higher than Ω. The inversion of this result using again the FFT algorithm yields the desired filter. The complexity of this procedure is of the order O(n log n). It is clear that the straightforward evaluation of (2.7) requires O(n) elementary operations per time point.

The estimate f̃ requires evaluation of the weighted smoothing rule in (2.10) for |k| ≤ n. Rules of this form can be computed by a sequence of straightforward arithmetic operations or by the FFT algorithm, due to the fact that (2.10) is a convolution operation. It is shown in this paper that the parameter M should be selected as a slowly growing function of n. Once the smoothed values {ȳ_k} are computed, the sum in (2.13), of length 2n + 1, must be evaluated. Since formula (2.13) is itself a convolution, it can be implemented with the complexity O(n log n), provided that O(n) different time points are used in the evaluation of f̃. Thus, the overall complexity of f̃ includes the additional smoothing step, and the estimate f̃ is more computationally complex than the estimate f̂.

III. BAND-LIMITED SIGNALS

In this section, we assume that f ∈ BL(Ω). Moreover, only the weak convergence properties of the estimates are examined. Thus, as the performance measure we employ the MISE:
  MISE(f̂) = E{ISE(f̂)}

and MISE(f̃) is defined analogously.
The error can be decomposed into variance and bias components as follows:

  MISE(f̂) = IVAR(f̂) + IBIAS(f̂)

where

  IVAR(f̂) = ∫_{−∞}^{∞} var{f̂(t)} dt     (3.1)

and

  IBIAS(f̂) = ∫_{−∞}^{∞} (E f̂(t) − f(t))² dt.     (3.2)

The term IVAR(f̂) reflects the existence of random errors in the observed samples. On the other hand, IBIAS(f̂) describes the truncation and approximation errors. Our signal recovery problem belongs to the class of nonparametric curve estimation problems studied extensively in the statistical literature; see [10], [14], and references cited therein. There, however, an unknown function is assumed to be defined on a finite interval and to satisfy some smoothness conditions having limited interpretation in signal theory. On the other hand, our estimation problem is defined on the whole real line, due to the well-known fact that a band-limited signal cannot be time limited. Consequently, we must impose some conditions on the tails of the estimated signal f(t) or, equivalently, on the smoothness of its Fourier transform F(ω). Hence, we shall use the following assumption throughout the paper.

Assumption 1: Let k be a fixed nonnegative integer. Suppose that the kth derivative F^{(k)} of F exists and is absolutely continuous, with F^{(k)} being of bounded variation on [−Ω, Ω] and continuous at ω = ±Ω.

This condition is used in [6] for evaluating the pointwise truncation error in the cardinal expansion (1.2). Assumption 1 can also be considered in the case k = 0, and then it should read as follows: F is of bounded variation and Lipschitz continuous at ω = ±Ω; see [22] for an in-depth analysis of this situation. It is clear that Assumption 1 affects the behavior of f(t) at t = ±∞. In fact, Assumption 1 implies

  |f(t)| ≤ c V(F^{(k)}) |t|^{−(k+1)}     (3.3)

for |t| large enough, where V(F^{(k)}) is the total variation of F^{(k)} on [−Ω, Ω]. Let ‖F‖ denote the L₂ norm of F. We will also need the constant defined in (3.4), which summarizes the covariance structure of the noise process (1.4) through the filter coefficients {c_j}.

The following result describes the accuracy of the estimate f̂.

Theorem 3.1: Let Assumption 1 be satisfied and let (1.5) hold. Then, for τ ≤ π/Ω, IVAR(f̂) admits an explicit bound expressed through the constant in (3.4), and IBIAS(f̂) admits the bound (3.5).

The proof of Theorem 3.1 can be found in the Appendix. It should be noted that formula (3.4) describes the correlation nature of the noise process. It is also worth mentioning that for the case of uncorrelated noise, i.e., when c₀ = 1 and c_j = 0 otherwise, we obtain an exact formula for IVAR(f̂). The first two terms on the right-hand side of (3.5) describe the bound for the integrated squared discretization error between f̂ and f over the observation interval. Observe that these terms are of smaller order than the IVAR(f̂) term. The last term in (3.5) describes the behavior of the truncation error. The bounds given in Theorem 3.1 reveal that there is a sampling period τ minimizing the error. Direct minimization of the bounds leads to the following result concerning the rate of convergence.

Corollary 3.1: Let the conditions of Theorem 3.1 hold. Then, for the sampling period τ chosen as an explicit function of n, we have the rate of convergence (3.6) for MISE(f̂).

The constant in the above formula for τ can be specified as in (3.7). It is seen that this constant is inversely proportional to a power of the noise variance and is independent of Ω. On the other hand, the constant defining the bound in (3.6) increases with the noise variance as well as the signal bandwidth; hence, the error increases with both.

Note also that for signals with fast decay (large k), the rate in (3.6) can be close to the best attainable one. This rate, however, cannot be achieved by our method since it is known [11], [35] that a band-limited signal cannot decay arbitrarily fast.

Let us now turn to the estimate f̃ defined in (2.13). We need to impose some conditions on the weight sequence {w_j}. Hence, let

  Σ_{|j| ≤ M} w_j = 1     (3.8)
  Σ_{|j| ≤ M} j^s w_j = 0,  s = 1, …, r − 1     (3.9)

together with a boundedness condition on the magnitude of the weights, with explicit constants,     (3.10)

and

  Σ_{|j| ≤ M} w_j² ≤ c*/(2M + 1)     (3.11)

for some positive constant c*. We refer to weights satisfying (3.8)–(3.11) as weights of order r. It is clear that the uniform weights w_j = 1/(2M + 1), |j| ≤ M, satisfy (3.8)–(3.11) with r = 2 and c* = 1. Note also that generally c* ≥ 1. In fact, by (3.8) and the Cauchy–Schwarz inequality we have

  1 = (Σ_{|j| ≤ M} w_j)² ≤ (2M + 1) Σ_{|j| ≤ M} w_j².

Condition (3.9) is particularly important and can be interpreted as the reproducing property. In fact, suppose that for some coefficients a₀, …, a_{r−1} the data satisfy

  y_j = Σ_{i=0}^{r−1} a_i j^i  for all j.

That is, the data lie exactly on a polynomial of order r − 1 and are noise free. Then it is seen that for the weights satisfying (3.9) we have ȳ_k = y_k for every k. Hence, we have the reproducing property. We refer to [1], [14], [19], [26] for an extensive discussion of linear smoothing procedures.

In most practical applications, order r = 4 seems to be sufficient. This gives the local cubic reproduction property, i.e., we can reproduce peaks and inflection points in the data. Besides the uniform weights w_j = 1/(2M + 1), another interesting second-order weight sequence is given in (3.12). Straightforward algebra shows that for this weight sequence the factor Σ_{|j| ≤ M} w_j² is much larger than for the uniform weights. An example of a fourth-order weight sequence is given in (3.13). The analysis presented below, see also (2.12), reveals that IVAR(f̃) is proportional to the factor Σ_{|j| ≤ M} w_j² appearing in (3.11). Hence, this factor defines the variance reduction, and one would like to minimize it subject to the reproducing condition (3.9). It can be shown that the sequences in (3.12) and (3.13) realize this optimality criterion for r = 2 and r = 4, respectively.
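The moment conditions (3.8) and (3.9) are easy to verify numerically. Since the explicit sequences (3.12) and (3.13) are not reproduced above, the sketch below constructs a symmetric fourth-order weight sequence of the two-parameter form w_j = a + b j² (our choice, not necessarily the paper's optimal sequence) and checks the cubic reproduction property:

```python
import numpy as np

def order4_weights(M):
    # Symmetric weights w_j = a + b*j**2 on |j| <= M, solved so that
    # sum_j w_j = 1 and sum_j j^2 w_j = 0; odd moments vanish by symmetry,
    # giving vanishing moments of orders 1, 2, 3 as in (3.9) with r = 4.
    j = np.arange(-M, M + 1)
    A = np.array([[np.sum(j**0), np.sum(j**2)],
                  [np.sum(j**2), np.sum(j**4)]], dtype=float)
    a, b = np.linalg.solve(A, [1.0, 0.0])
    return j, a + b * j**2

M = 5
j, w = order4_weights(M)
print(w.sum(), (j * w).sum(), (j**2 * w).sum(), (j**3 * w).sum())  # 1, 0, 0, 0
# reproducing property: noise-free cubic data pass through unchanged
y = 2.0 + 0.5 * j - 3.0 * j**2 + 0.1 * j**3
print(np.dot(w, y), y[M])       # both equal the value at j = 0
```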
To examine the accuracy of f̃, let us observe that, due to the orthogonality of {φ_k}, see (2.2), we have for ISE(f̃) the decomposition (3.14) into a stochastic component and an approximation component. This decomposition holds despite the fact that the random variables {ȳ_k} are dependent. Formula (3.14) yields the corresponding decomposition (3.15) for MISE(f̃). It is clear that the first term in (3.15) defines IVAR(f̃), whereas the last two terms define IBIAS(f̃). The following theorem gives bounds for each component of the preceding decomposition.

Theorem 3.2: Let Assumption 1 be satisfied and let (1.5) hold. Suppose that the weight sequence {w_j} satisfies conditions (3.8)–(3.11). Then we have explicit bounds for IVAR(f̃) and IBIAS(f̃), where the constants are given in terms of the quantities in (3.4) and (3.10).

The direct minimization of the sum of the bounds in Theorem 3.2 with respect to τ and M gives the optimal values of these parameters as functions of n. Then, observing that the discretization error is of smaller order than the remaining terms, we can obtain the following result concerning the rate of convergence.

Corollary 3.2: Let the conditions of Theorem 3.2 hold. Then, selecting τ and M as above, we have the rate of convergence (3.16) for MISE(f̃).
A number of conclusions can be drawn from the result in (3.16). First, let us observe that both the optimized window size M and the sampling interval τ depend on the tail parameter k and the weight order r. Note that M is increasing in one of these parameters and decreasing in the other, whereas τ is increasing in both. For all weight sequences of order two, e.g., (3.12), we obtain the rate of convergence (3.17) with the corresponding choices of τ and M. On the other hand, for the weights of order four, see (3.13), we obtain the rate (3.18) with the corresponding τ and M.

The order of the weight sequence is selected according to the number of existing derivatives of the signal f. In this section f ∈ BL(Ω), i.e., f is an analytic function, so one could try to construct a class of weights for which the reproducing property (3.9) holds for arbitrarily large r. In this case, the rate (3.19) could be obtained, with the window size of a corresponding order. Hence, one can conjecture that this is an optimal rate of convergence for the estimate f̃ within the class of band-limited signals with sufficiently fast-decaying tails. Nevertheless, it is a difficult task to construct weights satisfying (3.9) for a large value of r. Moreover, it is important to notice that the constant c* appearing in condition (3.11) is increasing with r. Theorem 3.2 reveals that the estimation error is proportional to c*. Hence, the benefits of using higher order weight sequences can be drastically reduced by a larger value of c*.

Comparing (3.6) with (3.16), we observe that f̃ can have a better rate of convergence than f̂ as long as we choose the weight sequence of a sufficiently large order, i.e., if a condition relating the order r to the tail parameter k is satisfied. As we have already mentioned, in practice we use the weight sequences of orders r = 2 and r = 4. Thus, if one uses the weight sequence of order two, then the estimate f̃ exhibits a slower rate of convergence than that for the estimate f̂. For the weights of order r = 4, the estimate f̃ has a better rate of convergence only for slowly decaying band-limited signals (small k). If, however, k is large, then the estimate f̂ outperforms f̃. Therefore, we can finally conclude that for a large class of band-limited signals, the estimation scheme depicted in Fig. 1(a) gives a better performance than the method using a local smoothing strategy. Table I summarizes the preceding discussion, showing the MISE of f̂ and f̃ for numerical values of the tail parameter k. Regarding the estimate f̃, the rates corresponding to the weight sequences of orders r = 2 and r = 4 are exhibited.

TABLE I: RATES OF CONVERGENCE FOR f̂ AND f̃ IN THE CASE OF BAND-LIMITED SIGNALS

IV. NON-BAND-LIMITED SIGNALS
When the signal f(t) is not band limited, one can approximate f by a sequence of band-limited functions in BL(Ω) with an increasing bandwidth. This is essentially the fundamental idea of the so-called multiresolution analysis; see [9], [34] for a full account of multiresolution and wavelet theory. Hence, the cardinal series in (2.1) can no longer represent f(t) for any finite Ω. Instead, the sum on the right-hand side of (2.1) may be viewed as an interpolation operator which can converge to f if τ → 0. The theory of such operators for a large class of non-band-limited signals has been studied in [2]–[5], [7], [32], [31], and in references cited therein.

In the context of the estimate f̂ in (2.7), we can choose Ω = Ω_n → ∞ at a certain controlled rate, allowing us to obtain a consistent estimate of f. In particular, we can specify Ω as a dyadic multiple of the bandwidth of an initial space, the multiplier corresponding to the resolution level of the associated multiresolution approximation of L₂(ℝ). We show that for the signal reconstruction problem in the presence of noise there is an optimal Ω = Ω_n → ∞. Nevertheless, in the remaining part of the paper we write Ω for Ω_n.

Regarding the orthogonal series estimate f̃ in (2.13), we need a result (see (4.4)) on the approximation accuracy of a non-band-limited signal by its cardinal series. First we need an analog of Assumption 1 concerning the behavior of a signal at t = ±∞.

Assumption 2: Let |f(t)| ≤ C |t|^{−α} for |t| ≥ T, with some constant C and tail exponent α.

Concerning the smoothness condition, we use the following assumption.

Assumption 3: Let f have derivatives of order up to s, all bounded and belonging to L₂(ℝ).

It is known that under Assumption 3 the Fourier transform F(ω) of f decays as |ω|^{−s} for |ω| → ∞. Other conditions on the decay of F, alternative to Assumption 3, can be formulated. For instance, if the integrability condition (4.1) holds for arbitrarily small δ > 0, then f belongs to the space of functions with s continuous and bounded derivatives. Hence, both Assumption 3 and (4.1) describe the size of the tails of the Fourier transform of a non-band-limited signal.

It is clear that the fact that a signal is not band limited affects only the bias term in the decomposition of the MISE. The following result describes the accuracy of the estimate f̂ for a class of functions satisfying Assumptions 2 and 3.

Theorem 4.1: Let Assumptions 2 and 3 be satisfied and let (1.5) hold. Then, for τ ≤ π/Ω, we have explicit bounds for IVAR(f̂) and IBIAS(f̂), with constants which are stated explicitly in the proof.

Remark 4.1: Theorem 4.1 reveals that MISE(f̂) → 0 if τ → 0 and Ω → ∞ in a properly coordinated way as n → ∞. The required restrictions on the exponents with which τ decreases and Ω increases define a convex convergence region in the corresponding plane. Using these restrictions on the range of the exponents and some geometric arguments, one can find the pair (τ, Ω) minimizing the sum of the bounds obtained in Theorem 4.1. This yields the following result on the rate of convergence for MISE(f̂).

Corollary 4.1: Let the conditions of Theorem 4.1 hold. Then, selecting τ and Ω as explicit functions of n, we obtain the rate of convergence (4.2) for MISE(f̂).

Surprisingly, the optimized τ is exactly the same as in the case of band-limited signals; see Corollary 3.1. The optimized Ω depends on s as well as α. Note that Ω increases more slowly with n for larger s, i.e., the smoother the signal is, the slower is the increase of Ω with n. In the extreme case of band-limited signals, we can choose Ω as a constant independent of n. The following remark summarizes the behavior of the estimation error for various combinations of α and s.

Remark 4.2:
i) For α → ∞, i.e., if f has light tails (almost time-limited signals), we can choose τ and Ω so that the corresponding rate approaches the best possible rate for nonparametric estimation methods applied to compactly supported functions satisfying Assumption 3; see [10], [14], and the references cited therein.
ii) For s → ∞ (almost-band-limited signals), one can select τ and Ω yielding the rate already obtained in Theorem 3.1; see Corollary 3.1.

Let us now turn to the estimate f̃ defined in (2.13). We wish to evaluate MISE(f̃) for the class of non-band-limited signals satisfying Assumptions 2 and 3. In this case, the series on the right-hand side of (2.1) is not equal to f(t). Observing that the orthogonality of {φ_k} still applies, and then using Parseval's formula (see (3.15)), we obtain the decomposition (4.3) of MISE(f̃). The first term in (4.3) defines the integrated variance IVAR(f̃), whereas the last three terms describe the integrated bias IBIAS(f̃). The last term is the error between the non-band-limited signal f and its cardinal expansion. Let us note that, by Parseval's formula, this term can be expressed in the frequency domain as the integrated squared difference between the discrete and continuous Fourier transforms of f; this is the quantity appearing under the integral sign in (4.4).

In [3], the behavior of the difference between the discrete and continuous Fourier transforms has been studied. Thus, one can deduce from [3, proof of Theorem 5] that (4.4) is of the order O(τ^{2s}) for the class of functions satisfying Assumption 3. See Lemma 4 in the Appendix for a summary of this important result. The following theorem summarizes the preceding discussion, giving bounds for each term in (4.3).

Theorem 4.2: Let Assumptions 2 and 3 be satisfied and let (1.5) hold. Suppose that the weights {w_j} meet conditions (3.8)–(3.11) with r = s. Then, for τ ≤ π/Ω, we have an explicit bound for IVAR(f̃) and the bound (4.5) for IBIAS(f̃)
where the constants are defined in (3.10), Assumption 2, and in Lemma 4, respectively.

Minimizing the bounds given in Theorem 4.2 and recalling the discussion from Section III (see Corollary 3.2), we can obtain the following result concerning the rate of convergence.

Corollary 4.2: Let the conditions of Theorem 4.2 hold. Then, for τ and M selected as in (4.6), we have the rate of convergence (4.7) for MISE(f̃).
Let us observe that the rate obtained in Corollary 4.2 is identical with the rate given in Corollary 3.2 for band-limited signals, with the weight order r replaced by the smoothness index s of the underlying class of functions. It is worth noting that in Theorem 4.2 it has been tacitly assumed that the order of the weight sequence is equal to the smoothness index s; if the two differ, one has to replace s in Corollary 4.2 accordingly. The rate in (4.7) is the same as in the case of band-limited signals since the size of the approximation error represented by the last term in (4.5) is negligibly small compared to the other terms. In Section VI, we argue that this term can even decay exponentially fast. Note also that the optimal M in (4.6) depends on α as well as s. This should be contrasted with the optimal tuning needed for the estimate f̂; see Corollary 4.1. In the case of twice differentiable signals, the optimal sampling intervals for f̂ and f̃ admit explicit forms, and it is interesting to compare them directly.

Comparing the accuracy of f̂ with f̃, let us observe that if the order of the window sequence is at least the smoothness index s, then the estimate f̃ outperforms the estimate f̂. This results from the rates given in (4.2) and (4.7). In practice, however, one confines the selection of the order to r = 2 or r = 4, and then for sufficiently smooth signals the estimate f̂ gives a better rate of convergence than f̃. In fact, for, e.g., r = 2 and s > r, comparing the resulting rates shows that for the class of signals satisfying Assumption 3 with s large enough we have MISE(f̂) of smaller order than MISE(f̃). Finally, let us note that for α → ∞, i.e., if f has light tails (almost time-limited signals), the rate in (4.7) approaches the best possible rate for nonparametric estimation methods applied to compactly supported functions satisfying Assumption 3, [10], [14]. For s → ∞ (almost-band-limited signals), we obtain the rate in (3.19); this rate, however, is impossible to reach since we would have to use a window sequence of infinite order. Table II summarizes the results concerning the rate of convergence for various combinations of the parameters α, s, and r. Only the cases r = 2 and r = 4 are considered since, as we have already noted, these represent the values used in practice.

TABLE II: RATES OF CONVERGENCE FOR f̂ AND f̃ IN THE CASE OF NON-BAND-LIMITED SIGNALS

V. STRONG CONSISTENCY

Thus far we have examined the weak convergence of our reconstruction algorithms. This kind of convergence guarantees that, on average, the reconstruction method approaches a true signal. It is an important issue to verify whether such a property holds for almost every realization of the noise process. Hence, recalling the definition of the reconstruction error, see (1.7), we wish to verify whether

  ISE(f̂) → 0  almost surely as n → ∞     (5.1)

and analogously for the estimate f̃. We will use the abbreviation a.s. in (5.1) to denote the strong consistency. Our main interest, however, is in establishing results on the strong rate of convergence. Thus, we say that ISE(f̂) → 0 a.s. with the rate a_n, a_n → 0, and denote this as ISE(f̂) = O(a_n) a.s., if

  lim sup_{n→∞} ISE(f̂)/a_n < ∞  a.s.     (5.2)

Our results on the rate of convergence depend on the structure of the noise model. Thus, let us replace (1.4) by the following finite impulse noise model:

  ε_k = Σ_{j=0}^{m} c_j ξ_{k−j}     (5.3)

for some nonnegative finite integer m.
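To illustrate the finite-memory model (5.3), the following sketch (with an arbitrary impulse response c and memory m = 3, both our choices) simulates the process and confirms empirically that the autocovariance vanishes for lags beyond m:

```python
import numpy as np

# A finite-impulse, m-dependent noise process as in (5.3): a moving average
# of i.i.d. innovations. Model (5.3) implies the autocovariance is zero for
# lags greater than m.
rng = np.random.default_rng(1)
m, N = 3, 200_000
c = np.array([1.0, 0.6, 0.3, 0.1])           # c_0, ..., c_m
xi = rng.standard_normal(N + m)              # i.i.d. innovations
eps = np.convolve(xi, c, mode="valid")       # eps_k = sum_j c_j xi_{k-j}

def acov(x, lag):
    # empirical autocovariance at the given lag (the mean is zero by design)
    if lag == 0:
        return np.mean(x * x)
    return np.mean(x[:-lag] * x[lag:])

for lag in range(m + 3):
    print(lag, round(acov(eps, lag), 3))     # approximately 0 for lag > m
```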
Similarly as in (1.4), let us assume that {ξ_k} is a sequence of i.i.d. random variables with zero mean, satisfying the moment condition (5.4).

The parameter m in (5.3) defines the memory of the noise process, and we should note that (5.3) is a more restrictive condition than (1.4), where m = ∞. The linear process in (5.3) represents an example of a sequence of m-dependent random variables. It is interesting to note that the weak convergence established in Sections III and IV is virtually insensitive to the structure of the noise process, i.e., to its memory length and to the moment condition in (5.4). This is not the case for the strong convergence, and we show that both the memory and the moment condition can drastically influence the rate of convergence. In fact, for noise processes with very long memory, in the sense of having a correlation function which persists over a very long time, we can have an arbitrarily slow rate of convergence; see Corollary 5.4 below for a discussion of this issue. Nevertheless, under assumption (5.3) we are able to establish strong rates of convergence close to the rates obtained in Sections III and IV for the average error MISE. This is particularly true if higher order moments of ξ_0 exist. It is also worth noting that the problem of nonparametric estimation of a regression function with m-dependent noise was examined in [18]. We also refer to [27] for the problem of designing robust systems for detecting signals in the presence of m-dependent noise.

Let us first consider the case of band-limited signals examined in Section III. The following theorem summarizes the results on the strong rate of convergence for the estimates f̂ and f̃.

Theorem 5.1: Let f ∈ BL(Ω) and let Assumption 1 be satisfied. Suppose that (5.3) and (5.4) hold.
a) Selecting the sampling interval τ as in (5.5), we have the rate (5.6) for ISE(f̂), a.s.
b) Suppose that the weight sequence {w_j} satisfies conditions (3.8)–(3.11). Then, selecting τ and M as in (5.7), we have the rate (5.8) for ISE(f̃), a.s.

Corollary 5.1: Part a) of Theorem 5.1 is concerned with the estimate f̂, and it reveals that the obtained rate is reduced by a factor of two (up to a logarithmic term) compared to the rate evaluated for the average error MISE(f̂); see (3.6). Also note that the optimized sampling interval in (5.5) is smaller than that required for the rate of MISE(f̂). In fact, denoting the optimized sampling intervals for MISE(f̂) and ISE(f̂) by τ° and τ*, respectively, we observe the relationship (5.9) between them.

Part b) gives the strong rate of convergence for the estimate f̃. Again, the rate in (5.8) is reduced by a factor of two compared to that for MISE(f̃); see (3.16). It is worth noting that the optimized sampling interval in (5.7) is identical to that minimizing MISE(f̃), i.e., it is not influenced by the tail condition of the noise process. On the other hand, the optimized window width M in (5.7) is larger than that required for MISE(f̃); see Corollary 3.2.

Corollary 5.2: The fact that the rates established in Theorem 5.1 are reduced by a factor of two is related to the moment condition (5.4). This restriction can be replaced by a more general requirement of the form (5.10), involving a moment-growth exponent and a finite constant. Then, arguing as in the proof of Theorem 5.1, we can show that the rate (5.11) holds a.s. for ISE(f̂), with the corresponding choice of τ. Regarding the estimate f̃, we obtain the rate (5.12) a.s., with the corresponding choices of τ and M. Thus, in this general situation, the rate is decreased by a factor determined by the exponent in (5.10). Again, the optimized sampling interval for the estimate f̂ is not influenced by the moment condition (5.10).

Corollary 5.3: In a number of practical situations, condition (5.10) can hold for all values of the exponent. This includes the important example of Gaussian noise. In this case, we obtain the rates (5.13) and (5.14) for ISE(f̂) and ISE(f̃), respectively, so that the rates are reduced merely by a logarithmic factor. The strongest version of (5.10) is the case of bounded noise, i.e., when

  |ξ_k| ≤ C a.s. for all k and some finite constant C.     (5.15)

Then we obtain the rate (5.16) for ISE(f̂)
and the rate (5.17) for ISE(f̃), with the corresponding choices of τ and M. Hence, the rates are reduced merely by a logarithmic factor and are otherwise identical to those specified for the average error.

In order to get further insight into the aforementioned results, let us observe (see the proof of Theorem 5.1) that we have, in fact, obtained exponential bounds for both ISE(f̂) and ISE(f̃), i.e., for every δ > 0 the probability that the ISE exceeds δ admits a bound of the form (5.18), for some positive constants. It is known that exponential inequalities typically hold for bounded random variables [12], [17], [30]; for the bounded noise process our rates are reduced merely by a logarithmic factor. The further reduction of rates is due to assumption (5.10). Under this condition, and the fact that {ξ_k} is a sequence of i.i.d. random variables, a standard tail bound holds for the maximum of |ξ_k| over the record, for any positive constant, [30]. This and the Borel–Cantelli lemma imply that the exceedances of the truncation level occur only finitely often. Thus, one can say that the random variables {ξ_k} are essentially equivalent to their truncated versions for sufficiently large k, provided that the moment in (5.10) exists. Thus, the deterioration of the aforementioned rates is unavoidable. Note finally that the case of noise processes with very long tails must be treated in a completely different way, and this is beyond the scope of this paper.

Corollary 5.4: Thus far we have not examined the influence of the memory size m on the rate of convergence. For a fixed and finite m, all the rates obtained earlier are influenced by m only in the asymptotic constant but not in the exponent. Precisely speaking, due to Lemma 5 of the Appendix, we should understand a result such as (5.6) as holding with an asymptotic constant proportional to the memory size, and analogously for all the aforementioned results. Hence, the reduction of the rate becomes noticeable for values of m comparable to n. The case m = m_n → ∞ requires a delicate analysis involving some conditions on the rate of decay of {c_j}; in fact, for an infinite sequence {c_j} we can expect a further deterioration of the rates. We refer to [8] for the theory of m_n-decomposable processes, allowing one to deal with the case m = m_n → ∞. This issue, however, is beyond the scope of this paper and is left for further research.

Consider, finally, the case of non-band-limited signals. The rates are similar to those obtained in Section IV, with appropriate modifications discussed in Theorem 5.1 and Corollaries 5.2–5.4.

Theorem 5.2: Let Assumptions 2 and 3 hold. Suppose that (5.3) and (5.4) are satisfied.
a) Selecting τ and Ω as in (5.19), we have the rate (5.20) for ISE(f̂), a.s.
b) Suppose that the weight sequence {w_j} satisfies conditions (3.8)–(3.11) with r = s. Then, selecting τ and M as in (5.21), we have the rate (5.22) for ISE(f̃), a.s.

Corollary 5.5: Analogously to Corollary 5.2, if condition (5.10) is met, we obtain the correspondingly weakened rates for the estimates f̂ and f̃. Yet for the bounded noise process, see (5.15), rates reduced merely by a logarithmic factor can again be derived.

VI. SIMULATION RESULTS

The theory presented in the preceding sections is either asymptotic or gives bounds involving unknown constants.
Fig. 2. (a) MISE versus cutoff frequency W for different values of the noise variance err. (b) Optimal cutoff frequency W versus error variance err. Signal (6.2) is used.

The aim of the simulations summarized in this section is to evaluate the estimation accuracy and to give some hints on choosing the parameters of the signal recovery algorithms when only a limited number of observations is available. In particular, we are interested in the dependence of the tuning parameters W and M on the noise variance and the sampling rate.
The reconstruction error MISE is empirically evaluated by approximating the integral in (1.7) on a finite grid and replacing the expectation by an average over independent repetitions; this defines the empirical error (6.1), with an analogous definition for the estimate f̃. In all simulations, the expectation in (6.1) was obtained by averaging over 100 repetitions.

A. Simulated Signals and Their Properties

The following signals are considered: the band-limited signal defined in (6.2) and the non-band-limited signal defined in (6.3). Note that the normalizing constants in (6.2) and (6.3) are chosen accordingly.
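A minimal Monte Carlo harness in the spirit of (6.1) is sketched below; since the exact signals (6.2) and (6.3) are not reproduced above, sinc(t) serves as a stand-in, and the grid, noise range, and record length are our choices (only the count of 100 repetitions follows the text):

```python
import numpy as np

# Monte Carlo evaluation of the empirical MISE in the spirit of (6.1):
# a discretized squared error averaged over independent noise realizations.
rng = np.random.default_rng(2)
tau, omega, n, err = 0.5, np.pi, 200, 0.3
k = np.arange(-n, n + 1)
f = np.sinc                                           # stand-in test signal
t = np.linspace(-10.0, 10.0, 801)
dt = t[1] - t[0]
S = (tau * omega / np.pi) * np.sinc(omega * (t[:, None] - k * tau) / np.pi)

mise = 0.0
for _ in range(100):                                  # 100 repetitions, as in the text
    y = f(k * tau) + rng.uniform(-err, err, k.size)   # uniform noise on [-err, err]
    mise += dt * np.sum((S @ y - f(t)) ** 2)
print("empirical MISE:", mise / 100)
```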
It should be observed that the signal in (6.2) is band limited and satisfies Assumption 1; hence, our results in Section III suggest that MISE(f̂) tends to zero at the rate given there. For the estimate f̃, the analogous rate for MISE(f̃) holds, provided that weights of order two are applied and τ and M are selected accordingly. The signal in (6.3) is not band limited and meets Assumption 2. Since its Fourier transform decays exponentially, Assumption 3 holds for arbitrarily large s, and the results of Section IV then suggest the corresponding rates for MISE(f̂) and, with the weights of order two, for MISE(f̃).

It has been noted in Section IV, see (4.4), that the behavior of the estimate f̃ in the case of non-band-limited signals depends on the approximation of a signal by its cardinal expansion. For the signal in (6.3) we can bound this approximation error explicitly. Let us observe that the 3-dB bandwidth of the signal is easily evaluated and that beyond it the Fourier transform decays exponentially; consequently, for small τ we obtain an exponential bound on the term in (4.4), and the bound becomes negligible. There is a large class of non-band-limited signals for which the term in (4.4) can decay exponentially fast.

B. Estimator f̂: Filtering Shannon's Series
In this subsection, the simulations for the reconstruction method (2.7) (with Ω replaced by the cutoff frequency W) are reported. As a true signal we have selected (6.2). The dependence of MISE on the filter cutoff frequency W is shown in Fig. 2(a). The different plots in this figure were obtained for varying variance of the errors (denoted as err) in the observations. The noise process is assumed to be uniformly distributed over a symmetric interval, with the noise variance err determined by the width of that interval. As one can notice, for each error variance there exists a W minimizing MISE. The dependence of the optimal W on err is shown in Fig. 2(b). Notice that the optimal W is a decreasing function of err. In the above simulations, the sampling frequency was kept fixed.
Fig. 3. (a) MISE versus cutoff frequency W for different sampling frequencies W_s. (b) Optimal cutoff frequency W versus sampling frequency W_s. Signal (6.3) is used.

Fig. 4. (a) Dependence of MISE on the range of averaging raw data M for increasing sampling intervals τ. (b) The dependence of the M minimizing MISE on τ. The plots were obtained for n = 100 samples from signal (6.2) sampled with errors uniformly distributed on [−0.3, 0.3].
In Fig. 3, the sampling frequency, denoted here as W_s, was varying, while the error variance was kept fixed. Contrary to the case of Fig. 2, here the observations were generated from the non-band-limited signal in (6.3). The dependence of MISE on the cutoff filter frequency W for different sampling frequencies W_s is shown in Fig. 3(a). Here again one can observe the existence of a most suitable W for each sampling frequency W_s. The dependence of this optimal cutoff frequency on W_s is plotted in Fig. 3(b). The monotonicity of this dependence is worth noticing.

C. Estimator f̃: Averaging Data for Shannon's Series

Here we test the estimator f̃ defined in (2.13) on simulated data of the same type as in the previous subsection. The uniform weight sequence (2.11) is used.

1) MISE Versus Averaging Interval M: The width M of the prefilter window is of crucial importance for the performance of f̃ when n is finite. Our aim is to evaluate the dependence of MISE on M, taking into account also other parameters influencing MISE. In all experiments the signal in (6.2) was used. First, consider the plots of MISE versus M, obtained for different sampling rates τ, which are shown in Fig. 4(a). The existence of an M minimizing MISE is clearly visible in these plots. It is also quite clear that the optimal M depends on τ, and this dependence is shown in Fig. 4(b), which suggests that it is reasonable to use a smaller M if τ is larger. Note that an increase of both M and τ leads to an increase of the estimation bias, hence MISE also increases. Thus, trying to minimize MISE for growing τ, we have to use a smaller M.

In Fig. 5(a), we show again the dependence of MISE on M. This time, however, τ is fixed, while in different plots the range ERR of the uniformly distributed random errors on [−ERR, ERR] is varying. For each ERR there exists a particular window width M for which MISE is minimized. The relationship between the optimal M and ERR is plotted in Fig. 5(b) by fat dots. Simultaneously, a curve of the form M = a + b log(ERR) is fitted to these points by least squares; the resulting function is shown in this figure as the fat continuous curve. The need for increasing the averaging window width as the amplitude of the errors increases is in agreement with intuition.

2) MISE Convergence Rate: The aim of the simulations reported in this subsection is to evaluate the dependence of the reconstruction error on the sampling rate τ. The estimate f̃ was evaluated for τ varying in a fixed interval, with the signal (6.2) used. We modify the reconstruction error defined in (6.1) as follows: NMISE is the empirical MISE normalized by the length of the observation interval. The normalization is necessary since the observation interval is varying with τ.
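The least-squares fit M = a + b log(ERR) used for Fig. 5(b) can be reproduced as follows; the (ERR, M) pairs below are placeholders, not the measured optima from the figure:

```python
import numpy as np

# Fit the empirical rule M = a + b*log(ERR) by ordinary least squares.
err = np.array([0.05, 0.1, 0.2, 0.4, 0.8])
m_opt = np.array([2.0, 3.0, 4.0, 6.0, 7.0])    # hypothetical optimal widths
b, a = np.polyfit(np.log(err), m_opt, 1)       # slope b, intercept a
print(f"M(ERR) = {a:.2f} + {b:.2f} * log(ERR)")
```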
Fig. 5. (a) Dependence of MISE on M for different values of the parameter ERR of the uniformly [−ERR, ERR] distributed noise process. Lower plots correspond to smaller ERR. (b) Dependence of the M minimizing MISE on ERR (fat dots) and fit of M = a + b log(ERR). n = 100 observations were simulated from signal (6.2), which was sampled with the rate τ = 0.2.

Fig. 6. (a) NMISE versus τ for different values of the error magnitude ERR. (b) Optimal τ versus ERR.
In Fig. 6(a), the dependence of NMISE on τ is plotted for different values of the parameter ERR of random errors uniformly distributed on [−ERR, ERR]. The optimal values of τ are easily visible. Fig. 6(b) depicts the optimal values of τ as a function of ERR. In both cases the window width M was kept fixed.

VII. CONCLUDING REMARKS

In this paper, a thorough analysis of two categories of signal reconstruction methods originating from the Shannon sampling series has been conducted. The first one is the postfiltering method, where the low-pass filter is applied to the interpolation expansion. In the second method, the locally filtered noisy data are used in the interpolation series. The accuracy of the algorithms in the case of band-limited and non-band-limited signals is assessed. A class of linear noise processes is taken into account, and the rates of convergence both in the weak and strong sense are established. In the latter case, the effect of the moment structure and the memory length of the noise process has been thoroughly examined. This detailed analysis reveals the suitability of the methods in each of the aforementioned situations.

Since no single method has uniformly better performance over signals ranging from the class of band-limited to the class of non-band-limited signals, it is an interesting option to combine these two alternative techniques. One such possibility would be to use a two-channel system as shown in Fig. 1, with the common input (1.3) and the output being a linear combination of f̂ and f̃,
i.e., f_λ = (1 − λ) f̂ + λ f̃, with λ ∈ [0, 1] being the shrinkage parameter depending on some prior information about the class of signals being estimated. Our results suggest that λ should be close to zero if the signal is almost band limited, i.e., the estimate f̂ should be selected. If the signal becomes less smooth (less band limited), then λ closer to one would be preferable. This could be further combined with the nonsmoothed estimate f*(t), which has an ideal bias. Details on such combined estimates will be reported elsewhere.

There are other possible extensions of the methodology proposed in this paper, e.g., to the case of multichannel sampling [19], [11], to the generalized sampling series proposed in [32], see (2.14), and to multivariate signals. In addition, the generalization of the obtained results to a wider class of dependent noise processes is also of great importance.

APPENDIX

In this section, we prove the results presented in the previous sections. For further considerations we shall need the following auxiliary results.

Lemma 1: Let f ∈ BL(Ω). Then
a) ‖f^{(k)}‖ ≤ Ω^k ‖f‖;
b) an analogous bound with an explicit constant holds for the supremum norm of f^{(k)};
where f^{(k)} is the kth derivative of f and ‖·‖ is the L₂ norm.
These inequalities are often referred to as Bernstein's inequalities [11], [16], [35].

Lemma 2: Let f ∈ BL(Ω). Then, for all τ ≤ π/Ω, the discretized L₂ norms of f admit two-sided bounds in terms of ‖f‖. The proof of this lemma can be found in [23].

Lemma 3: Let f ∈ BL(Ω). Then, for all τ ≤ π/Ω, analogous bounds hold for the discretized norms of the derivatives of f; if, in addition, suitable regularity conditions hold, a refined bound is obtained. The proof of this lemma can be found in [23].

Lemma 4: Let f be from the set of all functions in L₂(ℝ) with derivatives of order s in L₂(ℝ). Then there exists a constant independent of f and τ such that, for all sufficiently small τ, the difference between the discrete and the continuous Fourier transforms of f is of the order τ^s. This result describes the approximation accuracy of non-band-limited functions by the cardinal expansion. The bound can be deduced from [3, proof of Theorem 5] concerning the behavior of the difference between the discrete and continuous Fourier transforms.

Finally, we need the following fact concerning exponential bounds for sums of m-dependent random variables.

Lemma 5: Let Z₁, Z₂, … be a sequence of random variables such that the random vectors (Z₁, …, Z_i) and (Z_j, Z_{j+1}, …) are independent if j − i > m, where m is a positive integer, and let E Z_k = 0 and |Z_k| ≤ C for all k. Then, for every ε > 0:
a) a Hoeffding-type exponential bound holds for the tail probability of the partial sums;
b) a Bennett-type exponential bound holds, with constants depending on C, the variances of the Z_k, and the memory m.

These inequalities are often referred to as the Hoeffding and Bennett inequalities [12], [17], [30]. In the classical formulation of these inequalities, it is assumed that Z₁, Z₂, … are independent random variables. Here we give an extension of the inequalities to the case of m-dependent random variables with m ≥ 1. This modification can easily be derived by a quick inspection of the results obtained in [12]; see also [18, Lemma 5.1] for an alternative way of deriving exponential bounds for m-dependent random variables. We refer to [12], [17], [30] for a discussion of further generalizations of exponential inequalities.

Proof of Theorem 3.1: Due to the decomposition in (3.1) and (3.2), let us first consider the IVAR(f̂) term. Noting the form of the Fourier transform of f̂ and applying Parseval's identity, we obtain (8.1). Let us first observe the elementary identity (8.2) for the second moment of the noise sums. Due to the fact that {ε_k} is a stationary linear process, see (1.4), we obtain (8.3). Employing again the stationarity of {ε_k}, the absolute value of the second term in (8.2) is bounded by (8.4). This, along with (8.1)–(8.4), yields the bound (8.5) for IVAR(f̂), where the constant is given in (3.4). As for the bias term IBIAS(f̂) in (8.6), let us first observe the decomposition (8.7) into a discretization part and a truncation part. Using Lemmas 1–3, and then proceeding similarly as in [23, proof of Theorem 1], we can show that the discretization part in (8.7) is of the order claimed in (3.5). On the other hand, by Assumption 1, see (3.3), the truncation part in (8.7) does not exceed the last term in (3.5). The proof of Theorem 3.1 is thus completed.

Proof of Theorem 3.2: Recalling the decomposition in (3.15), let us first consider the variance term. From (2.10), and using the notation from the proof of Theorem 3.1 together with the stationarity of {ε_k}, we obtain (8.8). Let us now consider the correlation term in the formula for the variance, see (8.9), where the stationarity of {ε_k} was again employed. The above facts and the Cauchy–Schwarz inequality give a bound for the absolute value of the term in (8.9). This, along with (8.8), condition (3.11), and (8.5), yields the bound (8.10) for IVAR(f̃), where the constant is defined in (3.4). Let us now consider the second term in (3.15). From the reproducing property (3.9) and by Taylor's formula, we obtain a representation of the local smoothing bias through the rth derivative of f. Simple algebra and the Cauchy–Schwarz inequality, followed by Lemma 3, show that this term does not exceed (8.11); in turn, by Lemma 2 we have (8.12). Invoking (8.10)–(8.12), we obtain (8.13). This, Lemma 1 b), and (3.10) lead to the required bounds. Since, due to (3.3), the remaining truncation term is of the claimed order, the theorem follows.

Proof of Theorem 4.1: Let us first note that the fact that f is not band limited does not influence the variance term in (3.1). Thus, it suffices to consider IBIAS(f̂). Let us start with a decomposition of the bias into the error within the band and the out-of-band error, where the band-limited reference signal has the Fourier transform equal to F on [−Ω, Ω] and Ω → ∞ as n → ∞. By an analog of (8.7), we obtain bounds for the within-band terms. Regarding the out-of-band term, an application of Parseval's identity and the mean-value theorem yields (8.14). Due to Assumption 3, the second term in the decomposition of IBIAS(f̂) is equal to the integral of the squared Fourier transform of f outside the band; this does not exceed (8.15). Assumption 2, in turn, leads to a bound for the truncation term. The theorem follows from (8.13)–(8.15).

Proof of Theorem 4.2: We start with the decomposition given in (4.3) and observe that only the last three terms have to be evaluated. In fact, the reproducing property (3.9) and arguments similar to those used in the proof of Theorem 3.2 yield bounds for the first two of them. Assumption 2, in turn, leads to a bound for the truncation term. Finally, Lemma 4 assures that the last term on the right-hand side of (4.3) is bounded by a term of the order O(τ^{2s}). Collecting all of the aforementioned bounds, we can conclude the proof of the theorem.

Proof of Theorem 5.1: Let us start with the decomposition (8.16) of ISE(f̂) into a stochastic part and a deterministic part. First, taking into account only the critical terms in the bound (3.5), we obtain (8.17) for the deterministic part, with a positive constant. Specifying the rates at which τ → 0 and nτ → ∞, one can rewrite the bound in (8.17) as (8.18). Let us now consider the stochastic part of the reconstruction error ISE(f̂); this yields the probability bound (8.19). It is worth noting that, due to (5.3) and (5.4), we have the moment bound (8.20). To this end, let us decompose the random variable ξ_k as follows: ξ_k = ξ_k′ + ξ_k″, where ξ_k′ is the truncation of ξ_k at a level growing with n and ξ_k″ is the remainder. Due to (8.20), it is sufficient to evaluate the probability in (8.19) only for the noise process with ξ_k replaced by ξ_k′. Since {ξ_k′} is a sequence of bounded random variables with controlled mean and variance, one can apply Lemma 5. In fact, by (8.19) and Lemma 5 b), we obtain (8.21). Using the formula for the variance of the noise sums, one can bound (8.21) by (8.22), where the constants are positive. Taking into account (8.18) and choosing the deviation level proportional to the claimed rate, we can bound (8.22) by (8.23) for some positive constant. We can choose the proportionality constant large enough to make the term in (8.23) summable in n. Thus, by the Borel–Cantelli lemma, we have proved the assertion (8.24) for the truncated part. Let now the corresponding error term be the counterpart of (8.24) with ξ_k′ replaced by ξ_k″. Due to (8.20), the events on which ξ_k″ does not vanish occur only finitely often, a.s. Consequently, there is a set of probability one on which, for each realization, there exists a finite integer n₀ such that the remainder terms vanish for all n ≥ n₀. Since n₀ is finite, we can conclude the proof of part a) of Theorem 5.1. Since part b) can be verified using the aforementioned arguments and the results of Theorem 3.2, the proof of the theorem is completed.

In order to establish the results in (5.11) and (5.12), we proceed in the same way as in the proof of Theorem 5.1, using the truncation argument with the truncation level dictated by (5.10). On the other hand, in the case when all moments of ξ_0 exist, we use a suitably slowly growing truncation level. Finally, Theorem 5.2 can be proved by combining the above arguments and the results established in Theorems 4.1 and 4.2.

ACKNOWLEDGMENT

The authors wish to thank the Associate Editor and the anonymous reviewer for their valuable comments and suggestions.

REFERENCES

[1] T. W. Anderson, The Statistical Analysis of Time Series. New York: Wiley, 1971.
[2] T. Blu and M. Unser, "Quantitative Fourier analysis of approximation techniques: Part I—Interpolators and projectors," IEEE Trans. Signal Processing, vol. 47, pp. 2783–2795, Oct. 1999.
[3] J. H. Bramble and S. R. Hilbert, "Estimation of linear functionals on Sobolev spaces with applications to Fourier transforms and spline interpolation," SIAM J. Numer. Anal., vol. 7, pp. 112–124, Mar. 1970.
[4] J. L. Brown, "On the error in reconstructing a nonband-limited function by means of the bandpass sampling theorem," J. Math. Anal. Appl., vol. 18, pp. 75–84, 1967.
[5] P. L. Butzer, W. Engels, and U. Scheben, "Magnitude of the truncation error in sampling expansions of band-limited signals," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-30, pp. 906–912, Apr. 1982.
[6] P. L. Butzer and R. L. Stens, "Sampling theory for not necessarily band-limited functions: A historical overview," SIAM Rev., vol. 34, pp. 40–53, Mar. 1992.
[7] S. Cambanis and M. K. Habib, "Finite sampling approximation for non-band-limited signals," IEEE Trans. Inform. Theory, vol. IT-28, pp. 67–73, Jan. 1982.
[8] K. C. Chanda and F. H. Ruymgaart, "Curve estimation for m_n-decomposable time series including bilinear processes," J. Multivariate Anal., vol. 38, pp. 148–166, 1991.
[9] I. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.
[10] S. Efromovich, Nonparametric Curve Estimation: Methods, Theory and Applications. New York: Springer-Verlag, 1999.
[11] J. R. Higgins, Sampling Theory in Fourier and Signal Analysis. Oxford, U.K.: Clarendon, 1996.
[12] W. Hoeffding, "Probability inequalities for sums of bounded random variables," J. Amer. Statist. Assoc., vol. 58, pp. 13–30, 1963.
[13] A. Krzyżak, E. Rafajłowicz, and M. Pawlak, "Moving average algorithms for band-limited signal recovery," IEEE Trans. Signal Processing, vol. 45, pp. 2967–2976, Dec. 1997.
[14] C. Loader, Local Regression and Likelihood. New York: Springer-Verlag, 1999.
[15] R. J. Marks II, Introduction to Shannon Sampling and Interpolation Theory. New York: Springer-Verlag, 1991.
[16] R. J. Marks II, Ed., Advanced Topics in Shannon Sampling and Interpolation Theory. New York: Springer-Verlag, 1993.
[17] C. McDiarmid, "On the method of bounded differences," in Surveys in Combinatorics, ser. London Mathematical Society Lecture Notes, vol. 141. Cambridge, U.K.: Cambridge Univ. Press, 1989, pp. 148–188.
[18] H. G. Müller and U. Stadtmüller, "Estimation of heteroscedasticity in regression analysis," Ann. Statist., vol. 15, pp. 610–625, 1987.
[19] A. Papoulis, Signal Analysis. New York: McGraw-Hill, 1977.
[20] M. Pawlak, A. Krzyżak, and E. Rafajłowicz, "Exponential weighting algorithms for reconstruction of band-limited signals," IEEE Trans. Signal Processing, vol. 44, pp. 538–545, Mar. 1996.
[21] M. Pawlak and E. Rafajłowicz, "On restoring band-limited signals," IEEE Trans. Inform. Theory, vol. 40, pp. 1490–1503, Sept. 1994.
[22] M. Pawlak and U. Stadtmüller, "Recovering band-limited signals under noise," IEEE Trans. Inform. Theory, vol. 42, pp. 1425–1438, Sept. 1996.
[23] ——, "Kernel regression estimators for signal recovery," Statist. Probab. Lett., vol. 31, pp. 185–198, 1997.
[24] ——, "Nonparametric estimation of a class of smooth functions," J. Nonparametric Statist., vol. 8, pp. 149–183, 1997.
[25] ——, "Statistical aspects of sampling for noisy and grouped data," in Advances in Shannon Sampling Theory: Mathematics and Applications, J. Benedetto and P. Ferreira, Eds. Boston, MA: Birkhäuser, 2001.
[26] I. Pitas and A. N. Venetsanopoulos, Nonlinear Digital Filters. Boston, MA: Kluwer, 1990.
[27] H. V. Poor, "Signal detection in the presence of weakly dependent noise—Part I: Optimum detection," IEEE Trans. Inform. Theory, vol. IT-28, pp. 735–744, Sept. 1982.
[28] D. Slepian, "Some comments on Fourier analysis, uncertainty and modeling," SIAM Rev., vol. 25, pp. 379–393, June 1983.
[29] F. Stenger, "Approximations via Whittaker's cardinal function," J. Approx. Theory, vol. 17, pp. 222–240, 1976.
[30] W. F. Stout, Almost Sure Convergence. New York: Academic, 1974.
[31] M. Unser, "Sampling—50 years after Shannon," Proc. IEEE, vol. 88, pp. 569–587, Apr. 2000.
[32] M. Unser and I. Daubechies, "On the approximation power of convolution-based least squares versus interpolation," IEEE Trans. Signal Processing, vol. 45, pp. 1697–1711, July 1997.
[33] P. P. Vaidyanathan, "Generalizations of the sampling theorem: Seven decades after Nyquist," IEEE Trans. Circuits Syst. I, vol. 48, pp. 1094–1109, Sept. 2001.
[34] G. Walter, "Wavelets and sampling," in Advances in Shannon Sampling Theory: Mathematics and Applications, J. Benedetto and P. Ferreira, Eds. Boston, MA: Birkhäuser, 2001.
[35] A. I. Zayed, Advances in Shannon's Sampling Theory. Boca Raton, FL: CRC, 1993.