
IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 1, JANUARY 2012

Sensor-Centric Data Reduction for Estimation With WSNs via Censoring and Quantization Eric J. Msechu, Member, IEEE, and Georgios B. Giannakis, Fellow, IEEE

Abstract—Consider a wireless sensor network (WSN) with a fusion center (FC) deployed to estimate signal parameters from noisy sensor measurements. If the WSN has a large number of low-cost, battery-operated sensor nodes with limited transmission bandwidth, then conservation of transmission resources (power and bandwidth) is paramount. To this end, the present paper develops a novel data reduction method which requires no inter-sensor collaboration and results in only a subset of the sensor measurements transmitted to the FC. Using interval censoring as a data-reduction method, each sensor decides separately whether to censor its acquired measurements based on a rule that promotes censoring of measurements with least impact on the estimator mean-square error (MSE). Leveraging the statistical distribution of sensor data, the censoring mechanism, and the received uncensored data, FC-based estimators are derived for both deterministic (via maximum likelihood estimation) and random parameters (via maximum a posteriori probability estimation) for a linear-Gaussian model. Quantization of the uncensored measurements at the sensor nodes offers an additional degree of freedom in the resource conservation versus estimator MSE reduction tradeoff. Cramér–Rao bound analysis for the different censor-estimators and censor-quantizer estimators is also provided to benchmark and facilitate MSE-based performance comparisons. Numerical simulations corroborate the analytical findings and demonstrate that the proposed censoring-estimation approach performs competitively with alternative methods under different sensing conditions, while having lower computational complexity.

Index Terms—Censoring sensors, decentralized estimation, sensor fusion, sensor selection, wireless sensor networks.

I. INTRODUCTION

THE surge of interest in using wireless sensor networks (WSNs) for a multitude of signal processing applications has been inspired by advances in miniaturization, storage, and computational capabilities of the constituent sensor nodes. These application areas include smart agriculture [28], industrial monitoring [13], weather forecasting [16], military surveillance [3], disaster response [12], and health monitoring applications [21]. As the computation and storage capabilities of sensor nodes improve, more of the signal processing tasks are undertaken at the remote sensors, either for fault tolerance or to reduce the amount of data transmitted from sensor nodes and thereby conserve the network's stringent power and bandwidth resources.

This work focuses on FC-based estimation of a signal (e.g., temperature) from noisy measurements at remote sensor nodes, where the additive noise is Gaussian distributed and the signal is modeled as a linear regression function. The WSN envisioned is made up of battery-operated sensor nodes with limited power, limited radio communication range, and limited transmission bandwidth. In order to exploit the local processing and storage capabilities at the sensor nodes, while prolonging battery lifetimes and efficiently utilizing transmission resources, sensor data reduction methods are well motivated. Furthermore, if the WSN has a large number of sensor nodes, a desired estimation performance may well be achievable using only a fraction of the sensor data. Sensor-based, low-complexity data reduction methods will therefore be explored. Using the reduced data sent from the sensors, the FC can then estimate the desired signal parameter. FC-based estimators optimally incorporate knowledge of the data reduction technique in order to achieve estimator MSE performance that is competitive with alternative data reduction methods. Quantization of the measurements that survive censoring will be performed to further reduce the data transmitted from the sensors to the FC. Quantization offers an added degree of freedom in the tradeoff between data rate and estimation MSE.

Manuscript received May 21, 2011; revised August 11, 2011; accepted October 01, 2011. Date of publication November 30, 2011; date of current version December 16, 2011. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Roberto Lopez-Valcarce. This work was supported by the AFOSR MURI Grant FA9550-10-1-0567. Part of this work was presented at the Twelfth International Workshop on Signal Processing Advances in Wireless Communications, San Francisco, CA, June 2011, and at the Fourteenth International Conference on Information Fusion, Chicago, IL, July 2011. E. J. Msechu was with the Department of Electrical and Computer Engineering, University of Minnesota. He is now with Intel Corporation, Folsom, CA 95630 USA (e-mail: [email protected]). G. B. Giannakis is with the Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55108 USA (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSP.2011.2171686
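The data-rate versus MSE tradeoff described in this paragraph can be illustrated with a toy calculation. This is a minimal sketch with illustrative names and numbers (not from the paper); the MSE expression sigma^2 tr((H^T H)^{-1}) is the standard least-squares one used later in (4).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear model: N scalar sensor measurements, p unknown parameters.
# All names and numbers here are illustrative assumptions.
N, p, sigma = 400, 3, 1.0
H = rng.normal(size=(N, p))  # known regressors, one row per sensor

def ls_mse(Hs, sigma):
    """Trace of the LS error covariance sigma^2 (Hs^T Hs)^{-1}, i.e.,
    the estimator MSE when only the rows of Hs reach the FC."""
    return sigma**2 * np.trace(np.linalg.inv(Hs.T @ Hs))

mse_full = ls_mse(H, sigma)                        # all N transmitted
keep = rng.choice(N, size=N // 4, replace=False)   # only 25% transmitted
mse_quarter = ls_mse(H[keep], sigma)
print(mse_full, mse_quarter)
```

Transmitting fewer measurements inflates the estimator MSE; the censoring rules developed below aim to pick the subset whose loss of information is smallest.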
There is a multitude of data-reduction techniques used in estimation with WSNs such as collaborative data covariance-based dimensionality reduction using convex optimization [26], [29], use of information-based criteria [11], [15], and measurement quantization [7], [25]. Some of these works propose protocols that either require collaboration among sensors in the data-reduction step, or, require several rounds of sensor-FC communication to effect data-reduction. The proposed censoring approach in this work requires no inter-sensor communications and minimal feedback from the FC. Furthermore, unlike quantization for estimation where sensors communicate rate-reduced versions of all sensed measurements, the proposed censor-quantizer effects aggressive data reduction by quantizing only the uncensored measurements. The main contributions of this work include i) data-reduction via measurement censoring that is amenable to distributed WSN implementation as an alternative to sensor selection;

1053-587X/$26.00 © 2011 IEEE


ii) development of FC-based estimators that optimally incorporate knowledge of the censored-data model for both deterministic and random signal models; and iii) estimators based on quantized-uncensored measurements along with their Cramér–Rao lower bounds (CRLBs).

Section II describes the measurement model as well as the problem statement. Section III overviews maximum likelihood estimation (MLE) with full data as well as MLE with data selection. Distributed algorithms for the joint censoring-estimation tasks are the subject of Section IV. Quantizer-censor estimation following the maximum likelihood (ML) criterion, along with the associated CRLB analysis, is presented in Section V. Sections VI–VIII deal with the corresponding maximum a posteriori (MAP) estimation for random signals. Section IX presents simulation results, and Section X wraps up the paper.

Notation: Vectors (respectively, matrices) are denoted using lower- (upper-) case boldface letters. The probability density function (pdf) of x with parameter θ is represented as p(x; θ), where x denotes both the random variable as well as the value it takes. A Gaussian pdf with mean μ and variance σ² is represented as N(μ, σ²). The probability mass function (pmf) of an event A is denoted as Pr{A}. An estimator for θ will be represented as θ̂. The superscript T will stand for vector or matrix transposition. Positive definiteness (semi-definiteness) of a symmetric matrix M is denoted as M ≻ 0 (respectively, M ⪰ 0).

II. MODEL AND PROBLEM DEFINITION

Consider N sensor nodes {S_n}_{n=1}^N that constitute a WSN deployed to acquire noisy signal samples. A scalar measurement at the nth sensor S_n is assumed to obey the linear model

x_n = h_n^T θ + w_n,  n = 1, ..., N  (1)

where θ is a vector of unknown parameters to be estimated, h_n is a vector of known regressors, and w_n denotes independent, identically distributed (i.i.d.), zero-mean Gaussian noise with variance σ². Vector measurements per sensor in the presence of generally colored noise can be accommodated by using noise prewhitening as follows.
Let x_n denote the vector measurement at S_n, where x_n = H_n θ + w_n and w_n has pdf N(0, Σ_n). Noise prewhitening yields the vector Σ_n^{-1/2} x_n, whose entries are affected by i.i.d. noise. Each entry of Σ_n^{-1/2} x_n can then be treated as a scalar measurement in (1). The prewhitening step can also be used for scalar measurements with independent, zero-mean Gaussian noise with nonidentical variances σ_n², in which case Σ = diag(σ_1², ..., σ_N²). Consequently, in the sequel the focus will be on scalar sensor measurements with i.i.d. additive noise.

This work has two main objectives: i) low-complexity, sensor-centric data reduction; and ii) optimal FC-based estimation of θ based on knowledge of the data model, the noise pdf, and the data-reduction technique. The first objective is achieved via sensor censoring, which leads to only K of the N possible measurements being transmitted to the FC. Further data reduction via quantization of uncensored measurements


is also explored. The selected measurements should ideally be those, among all N, that lead to the lowest estimation MSE. The FC is assumed to know {h_n}_{n=1}^N and σ², which may be estimated during a training phase or deduced from knowledge of the physics of the problem. It is further assumed that sensor S_n has knowledge of h_n and σ². The second objective is to derive estimators of θ at the FC based on a selected subset of the N measurements. Both deterministic and random parameter estimation problems will be addressed. First, ML estimation for a deterministic signal parameter model is discussed.

III. MAXIMUM LIKELIHOOD ESTIMATION

A. MLE Based on Uncensored Data

Let θ be a vector of deterministic parameters. Supposing all measurements x := [x_1, ..., x_N]^T are available at the FC, the MSE-optimal estimator can be obtained using the full-data MLE. For the linear-Gaussian model, the MLE is obtained by solving the linear least-squares (LS) problem [9]

θ̂ = arg min_θ Σ_{n=1}^N (x_n − h_n^T θ)².  (2)

The LS estimator is obtained as

θ̂ = (H^T H)^{−1} H^T x,  H := [h_1, ..., h_N]^T.  (3)

Its performance is assessed by the covariance matrix given by

C_θ̂ = σ² (H^T H)^{−1}.  (4)

The full-data estimator MSE, given by the trace tr(C_θ̂), will benchmark the performance of the proposed MLE based on censored data. Before delving into censoring-based estimation, the selection-based estimator of [8] will be outlined to highlight its strengths and limitations in a WSN setting.

B. MLE Based on Data Selection

The overhead of sending all N measurements from the sensors to the FC, necessary for the estimator in (3), can be relieved by selecting only K of the measurements. In principle, given the selected measurements, a selection-based MLE (sMLE) is given by

θ̂_s = (H^T D_a H)^{−1} H^T D_a x,  D_a := diag(a_1, ..., a_N)  (5)

where the selection indicator vector a has entries obeying the constraint Σ_{n=1}^N a_n = K; that is, a_n = 1 if x_n is selected and a_n = 0 if it is not selected.

The selection process aims to choose the K measurements with the largest reduction of the resultant estimator MSE. Starting with the linear-Gaussian model, the selection indicators are obtained to approximately minimize the estimator MSE, subject to the cardinality constraint [8]. Since the cost function for measurement selection depends neither


on the measurements x nor on the unknown θ, the optimal selection can be found offline, prior to acquiring the measurements. This selection method was inspired by a method in statistics known as the A-optimal scheme, which is applied to the design of optimal experiments (DOE) [23]. The algorithm reported in [8] for centralized data selection incurs a computational complexity that grows polynomially in N. If nonlinear regression models were considered, a linear approximation would typically be needed in order to use the centralized sensor selection method of [8], since the MSE matrix would either be measurement-dependent or dependent on the unknown θ. A decentralized measurement reduction method will be introduced in the next section via measurement censoring, which has an overall complexity that scales as O(N), i.e., O(1) per sensor. Development of a novel ML estimator based on censored data will be presented in Section IV-B.
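The full-data LS benchmark of Section III-A can be sketched directly from (2)-(4). This is a minimal, self-contained illustration; the model sizes and values below are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)

# Linear-Gaussian model x = H @ theta + w, w ~ N(0, sigma^2 I), cf. (1).
N, p, sigma = 200, 3, 0.5
theta_true = np.array([1.0, -2.0, 0.5])
H = rng.normal(size=(N, p))
x = H @ theta_true + rng.normal(0.0, sigma, N)

# Full-data MLE = least squares, cf. (3): theta_hat = (H^T H)^{-1} H^T x.
theta_hat = np.linalg.solve(H.T @ H, H.T @ x)

# Estimator covariance sigma^2 (H^T H)^{-1}, cf. (4); its trace is the
# full-data MSE benchmark against which censoring is later compared.
cov = sigma**2 * np.linalg.inv(H.T @ H)
mse_benchmark = np.trace(cov)
print(theta_hat, mse_benchmark)
```

Using `np.linalg.solve` rather than forming the inverse explicitly is the numerically preferred way to evaluate (3).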

IV. ML ESTIMATION WITH DATA CENSORING

A. Data Censoring

With (τ_l, τ_u) denoting the censoring interval, consider the following censored counterpart of x_n:

x̃_n = x_n, if x_n ∉ (τ_l, τ_u);  x̃_n = ∅, otherwise.  (6)

Only uncensored measurements, i.e., those with x̃_n ≠ ∅, are transmitted to the FC. Censoring as a low-complexity method for data reduction has been used in FC-based distributed detection [1], [24], where instead of censoring x_n, the censored data are local log-likelihood ratios. Finding analogous censoring rules of the form (6) for decentralized estimation is, therefore, well motivated.

Data censoring can be further motivated as an approximate solution to the LS fitting problem. Suppose temporarily that θ is known and the goal is to select at most K regressors from {h_n}_{n=1}^N for which the selected measurements fit best in the LS sense. This leads to a constrained binary LS problem

min_a Σ_{n=1}^N (x_n − a_n h_n^T θ)²  (7a)
subject to a ∈ {0, 1}^N, Σ_{n=1}^N a_n ≤ K  (7b)

where a comprises a vector of binary {0, 1} selection variables. Two difficulties in solving (7) are evident: i) for large N, integer programming solvers (e.g., branch and bound [4]) are known to incur a prohibitive computational burden; and ii) if, in addition to a, the vector θ is also unknown, the cost in (7a) is a bilinear function of the unknowns, which renders the resultant optimization problem nonconvex and impossible to solve for the global minimum in a computationally efficient manner. These two difficulties can be overcome by: i) replacing the integer constraints {0, 1} with interval ones [0, 1], so that an approximate but convex alternative to (7) is formed, for which efficient solvers are available even for large N values; and ii) calculating an initial LS estimate θ̂₀ as in (2), using uniformly picked measurements from a small set of sensors, in place of the unknown θ.

Using a Lagrange multiplier corresponding to (7b) leads to the quadratic optimization problem in (8). Problem (8) is decomposable into N sub-problems, the nth one of which attains its minimum at â_n(λ*), where λ* denotes the optimum multiplier. Slicing â_n(λ*) leads to the selection a_n = 1{â_n(λ*) ≥ 1/2}, where 1{·} denotes the indicator function. The condition â_n(λ*) ≥ 1/2 can be re-cast as a threshold test on x_n when x_n and h_n^T θ have the same sign. The per-sensor measurement censoring rule is then summarized as

x̃_n = x_n, if â_n(λ*) ≥ 1/2;  x̃_n = ∅, otherwise.  (9)
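The remedy in ii), an initial LS estimate from a small uniformly picked pilot subset, followed by per-sensor censoring, can be sketched as follows. The interval test here is centered at the predicted measurement h_n^T θ̂₀ with an assumed symmetric half-width; this is an illustrative simplification, not the paper's exact rule (9), and all numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative linear-Gaussian data x_n = h_n^T theta + w_n, cf. (1).
N, p, sigma = 500, 2, 1.0
theta_true = np.array([3.0, -1.0])
H = rng.normal(size=(N, p))
x = H @ theta_true + rng.normal(0.0, sigma, N)

# Remedy ii): initial LS estimate theta0 from a small, uniformly drawn
# pilot subset, used in place of the unknown theta.
pilot = rng.choice(N, size=20, replace=False)
theta0 = np.linalg.solve(H[pilot].T @ H[pilot], H[pilot].T @ x[pilot])

# Each sensor then censors around its predicted measurement
# h_n^T theta0 (assumed symmetric interval of half-width tau).
tau = 1.0
uncensored = np.abs(x - H @ theta0) > tau
print(theta0, uncensored.sum())
```

Only the `uncensored` measurements would be transmitted to the FC; the multiplier-based threshold of (9)-(10) replaces the fixed `tau` in the actual scheme.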

Supposing that the optimum λ* and an initial estimate θ̂₀ are available at each sensor, censoring can be implemented autonomously at each sensor by using (9). What remains to fully specify the censoring rule is to find a λ* which satisfies the inequality constraint (7b), at least on average. To this end, note that the number of uncensored measurements is a random variable. Bounding its expected value not to exceed the desired value K yields

Σ_{n=1}^N Pr{x̃_n ≠ ∅; λ} ≤ K  (10)

where Q(·) is the complementary cumulative distribution function (ccdf) of the Gaussian pdf. It can be shown that the expected number of uncensored measurements is monotonic in λ. It thus follows that, upon replacing the inequality in (10) with an equality, a one-dimensional grid search readily yields the wanted λ*. For notational brevity, λ will henceforth be used in place of λ*, and θ̂₀ in place of the initial estimate.

Remark 1: Consider the following example, which highlights the influence of the regression function on the censoring rule (9). For a given sensor S_n, consider two regression-function cases: Case I with a small absolute value of h_n, and Case II with a large one. As depicted in Fig. 14, measurements with the Case I regression function have a larger probability of being censored than those of Case II. That is, the censoring rule retains measurements that have a large absolute value of the regression function. These retained measurements indirectly correspond to those with a larger reduction of the estimation MSE.

Remark 2: Regardless of how the censoring intervals are chosen, data reduction in the censoring approach does not rely


on the linearity of the regression function in (1). Consequently, the data selection rule in (9) readily extends to measurements with known nonlinear regression functions. In contrast, DOE-based selection critically relies on the linearity of the regression function during the selection step, so that the data selection problem is solvable offline at the FC prior to the sensors acquiring measurements.

B. MLE With Censored Data

In order to find the MLE with censored data, the joint pdf of the censored data {x̃_n}_{n=1}^N in (9) has to be found. Clearly, each uncensored measurement x_n has pdf N(h_n^T θ, σ²). Since the censoring indicator is a 0–1 random variable, the joint censored-data pdf is given as described in the following lemma.

Lemma 1: With θ in (1) modeled as deterministic, the censored data {x̃_n}_{n=1}^N are independently distributed.

Proof: Since the data are independently distributed and the censoring is performed separately per sensor, it follows readily that the censored data are independently distributed as well. The probability of a measurement being censored is obtained by using the Gaussian ccdf as

Fig. 1. Data transmission from WSN to FC with censoring.

given in (11).

Fig. 4 illustrates an example of the censored-data pdf. The reduced support of the data pdf is what leads to transmission power savings. From Lemma 1, the joint censored-data pdf is given by the product form

p(x̃; θ) = Π_{n=1}^N p(x̃_n; θ).  (12)

Taking the logarithm of the joint pdf, using (11) and (12), yields the log-likelihood in (13).

Remark 3: If all the measurements are uncensored, the log-likelihood in (13) reduces to the LS cost in (2). On the other hand, if all measurements are censored, it reduces to the log-pdf of binary quantized data [25] with all identical binary symbols. Both of these extreme cases have low probability of occurring for N large.

The censored MLE (cMLE) is obtained as [cf. (13)]

θ̂_cML = arg max_θ L(θ).  (14)

The ensuing proposition gives conditions that guarantee a unique solution to the maximization in (14) for a given set of measurements, based on the log-concavity of the Gaussian pdf and the log-concavity of the Gaussian ccdf. The proof is given in Appendix A.

Proposition 1: The log-likelihood L(θ) in (13) is strictly concave; its gradient and Hessian are obtained as in (15a) and (15b), where the constituent terms are defined in Appendix B.

The Fisher information matrix (FIM) is given in terms of quantities defined in Appendix C. Using the FIM, the CRLB can be found as follows.

Lemma 2: For any unbiased estimator θ̂ of the parameter vector θ based on censored data distributed according to the pdf p(x̃; θ), it holds that cov(θ̂) ⪰ FIM^{−1}(θ), where the expectation is taken with respect to the censored-data pdf p(x̃; θ).

The CRLB will be used in the simulations of Section IX as a benchmark to assess how far the variance of the (asymptotically) unbiased cMLE is from the theoretical bound.

C. cMLE Algorithms With WSNs

Algorithms for the censoring and estimation steps that are amenable to WSN implementation are detailed in this section. Only transmissions between the FC and the sensor nodes are considered; that is, there are no inter-sensor transmissions, leading to lower communication costs and a reduction of the overhead of the overall transmission protocol. As a result, the communication cost is quantified by the number of sensor-to-FC transmissions. A round-robin, slotted-time sensor schedule is envisioned, such that sensor S_n transmits in the nth time slot. If S_n censors its measurement, S_{n+1} transmits next. More elaborate sensor scheduling protocols are also possible; see, e.g., [2]. Fig. 1 illustrates the sensor-FC communication setup as well as the slotted-time transmission with measurement censoring. It is further posited that sensor-FC transmissions are practically error-free, a condition certainly satisfied using sufficiently powerful channel coding schemes. The computation and communication steps that constitute censoring are tabulated as Algorithm 1. Prior to censoring, the


FC calculates the threshold parameter λ and broadcasts it to all sensors. Since censoring is performed individually per sensor [cf. (9)], each sensor autonomously decides whether or not to censor its acquired measurement. This decentralized approach is in contrast to the DOE-based selection method in [8], where the selection indexes are obtained by solving a centralized convex optimization problem. Furthermore, with censoring, each sensor needs to acquire its measurement to be used in the censoring step, but only K of them are transmitted to the FC. In contrast, DOE-based selection leads to only the selected sensors acquiring measurements. Censoring is, consequently, attractive for applications where the resource costs for sensing are outweighed by the costs associated with transmission to the FC [16, Table 2].
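The FC's one-dimensional grid search for the threshold in (10) can be sketched as follows. The per-sensor probability of remaining uncensored is modeled here as 2Q(τ/σ), which assumes a common symmetric censoring interval; that assumption, and all names below, are illustrative.

```python
import numpy as np
from math import erf

def Q(z):
    """Standard Gaussian ccdf."""
    return 0.5 * (1.0 - erf(z / np.sqrt(2.0)))

def threshold_for_budget(K, N, sigma, grid=np.linspace(0.0, 6.0, 6001)):
    """Grid search in the spirit of (10): pick the smallest tau for
    which the expected number of uncensored measurements, here
    2*N*Q(tau/sigma) under a symmetric-interval assumption, does not
    exceed the transmission budget K."""
    for tau in grid:
        if 2 * N * Q(tau / sigma) <= K:
            return tau
    return grid[-1]

tau = threshold_for_budget(K=100, N=1000, sigma=1.0)
print(tau)
```

Because the expected number of uncensored measurements is monotone in the threshold, the first grid point satisfying the budget is the desired one.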

Algorithm 2 summarizes a damped-Newton iteration for implementing the maximization in (14), details of which can be found in standard optimization texts such as [5]. The cost associated with storage and computations for Newton's iterations is dominated by the storage and inversion, via Cholesky factorization, of the Hessian matrix, whose dimension equals that of θ. By comparison, an efficient interior-point implementation for solving the DOE-based sensor selection scales polynomially with N [8]. Since the Hessian dimension does not grow with the network size N, the proposed data-censoring approach incurs a lower computational cost compared to the sensor-selection approach.
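The damped-Newton iteration with Armijo backtracking used in Algorithm 2 can be sketched generically for any strictly concave objective; the function names and the quadratic sanity check below are illustrative assumptions, not the paper's log-likelihood.

```python
import numpy as np

def newton_armijo(f, grad, hess, theta0, tol=1e-8, max_iter=50,
                  beta=0.5, c=1e-4):
    """Damped-Newton ascent with Armijo backtracking line search,
    the scheme Algorithm 2 runs at the FC to maximize a concave
    log-likelihood."""
    theta = theta0.astype(float)
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(-hess(theta), g)   # Newton ascent direction
        t = 1.0
        # Backtrack until the Armijo sufficient-increase test passes.
        while f(theta + t * d) < f(theta) + c * t * (g @ d):
            t *= beta
        theta = theta + t * d
    return theta

# Sanity check on a strictly concave quadratic f(t) = -0.5 t^T A t + b^T t,
# whose maximizer solves A t = b (one exact Newton step).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
f = lambda t: -0.5 * t @ A @ t + b @ t
grad = lambda t: b - A @ t
hess = lambda t: -A
theta_star = newton_armijo(f, grad, hess, np.zeros(2))
```

For the cMLE, `f`, `grad`, and `hess` would be replaced by (13), (15a), and (15b), respectively.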

Algorithm 1 Censoring (cML)
Require: FC knows {h_n}_{n=1}^N, σ², K; sensor S_n knows h_n, σ²
Initialization:
  FC: Polls a small set of sensors for measurements, calculates θ̂₀ using (3)
  FC: Finds λ from (10) using a grid search
  FC: Broadcasts λ and θ̂₀
Sensor Censoring:
for n = 1, ..., N do
  S_n: Receives λ and θ̂₀, forms its censoring threshold
  S_n: Censors x_n to get x̃_n using (9)
  if x̃_n ≠ ∅ then
    S_n: Transmits x̃_n to the FC
  else
    S_n: Stays idle
  end if
end for

Algorithm 2 Censored ML Estimation (cML)
Require: Gradient tolerance ε; maximum iterations k_max
Data Reception:
  FC: Receives x̃_n; sets x̃_n = ∅ if no data is received at time slot n
Parameter Estimation:
  FC: Initialize θ⁰ = θ̂₀
  repeat
    FC: Calculate the log-likelihood, its gradient, and its Hessian using (13), (15a), and (15b)
    FC: Find the step size using Armijo's line search method [4]
    FC: Take the damped-Newton step
  until the gradient norm falls below ε or k_max iterations are reached
  FC: Set θ̂_cML to the last iterate

V. ML ESTIMATION WITH QUANTIZED-CENSORED DATA

By quantizing the uncensored measurements, further transmission power and bandwidth savings can be effected, since only a limited number of bits describing the uncensored measurements need to be transmitted to the FC. Incorporating the benefits of quantization of the censored data into ML estimation is the subject of this section. The conditional pdf of an uncensored x_n is obtained, using Bayes' rule, as the truncated Gaussian in (16). Uncensored measurements, distributed according to (16), form the input to the quantizer. With a known pdf and using the MSE as distortion metric, an L-level scalar quantizer can be designed for the uncensored x_n using the Lloyd-Max algorithm [17]. Since the quantizer design requires the unknown parameter θ [cf. (16)], the initial estimate θ̂₀ will be used instead. The quantizer codewords are indexed by {1, ..., L}, where L is the cardinality of the codebook. Quantization approaches tailored for estimation can be found in [6], where approximations needed for quantization of noisy signals are detailed. Using the square-error metric, the average MSE distortion, whose minimization forms the basis for the encoder and decoder designs, is given by the expected squared difference between the input and its reconstruction, where, using the centroid condition, the decoder output is the conditional mean of the input over the corresponding encoder partition. Encoder partitions are optimized using the nearest-neighbor rule for each codeword. For the MSE distortion criterion, globally optimal scalar quantization is attained by Lloyd's algorithm [17] exponentially fast, provided that the log-pdf is concave with a finite second moment [10]. Motivated by this strong result, Lloyd's iterative algorithm will be adapted here for determining encoder partitions. Since the quantized data are not intended to be reconstructed, but rather used for estimating θ, a uniquely defined codebook corresponding to uniquely identifiable partitions is sufficient; consequently, the index set {1, ..., L} will be used as the codebook.
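The Lloyd-Max design just described alternates the centroid and nearest-neighbor conditions. The sketch below runs it on a grid for a Gaussian truncated on one side; the one-sided truncation and all numbers are illustrative simplifications (the paper's uncensored support excludes an interval).

```python
import numpy as np

def lloyd_max(pdf, lo, hi, L, iters=200, grid_pts=20001):
    """Lloyd-Max scalar quantizer for a pdf supported on [lo, hi]:
    alternate the centroid condition (codewords = conditional means
    of their cells) and the nearest-neighbor condition (cell edges =
    midpoints of adjacent codewords)."""
    t = np.linspace(lo, hi, grid_pts)
    p = pdf(t)
    codes = np.linspace(lo, hi, L + 2)[1:-1]   # initial codewords
    for _ in range(iters):
        edges = np.concatenate(([lo], 0.5 * (codes[:-1] + codes[1:]), [hi]))
        for k in range(L):
            m = (t >= edges[k]) & (t <= edges[k + 1])
            w = p[m]
            if w.sum() > 0:
                codes[k] = (t[m] * w).sum() / w.sum()
    edges = np.concatenate(([lo], 0.5 * (codes[:-1] + codes[1:]), [hi]))
    return codes, edges

# Illustrative uncensored-data shape: Gaussian truncated to [tau, tau+5].
sigma, tau = 1.0, 1.0
pdf = lambda t: np.exp(-0.5 * (t / sigma) ** 2)
codes, edges = lloyd_max(pdf, tau, tau + 5 * sigma, L=4)
print(codes)
```

Only the partition index (not the codeword value) needs to be transmitted, which is why the index set suffices as the codebook.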


The partition thresholds are found using Lloyd's iterative algorithm; see also [7]. Quantization following the censoring rule (9) yields the finite-alphabet data

q_n = quantization index of x_n, if x̃_n ≠ ∅;  q_n = ∅, otherwise.  (17)

Fig. 5 depicts generic L-level quantizer intervals for a truncated Gaussian pdf; the compact notation used in (17) will be retained for brevity. Following quantization, the digital sensor measurements are distributed according to the conditional pmf in (18). By the same logic as in the censored-data case in Lemma 1, since quantization is performed separately per sensor node, a product-form pmf results. Note that, for each uncensored x_n, only one of the indicators is nonzero, corresponding to the codeword resulting from the quantization.

The block diagram in Fig. 2 illustrates the two stages of censoring and quantization. It should be noted that, since they are optimized separately, the obtained thresholds are not generally optimal for a joint censor-quantizer optimization metric. However, this hierarchical optimization leads to tractable optimization problems that are also amenable to distributed implementation. Since for each sensor the conditional pdf in (16) has a different approximate mean h_n^T θ̂₀ (and consequently a different truncation point), Lloyd's algorithm needs to be run separately at each sensor.

Fig. 2. Censoring and quantization at sensor S_n.

Upon considering the log-pmf, it follows that [cf. (18)] the corresponding gradient and Hessian are given, respectively, by (19a) and (19b), where the constituent terms are defined in Appendix D. The quantized-censored MLE (qcMLE) is consequently obtained as the maximizer of the log-pmf.

Analogous to the cMLE approach in Lemma 2, the CRLB for the quantized-data model is given in Lemma 3.

Lemma 3: If θ̂ denotes any unbiased estimator for θ based on data from the pdf (16), then its covariance is bounded below by the inverse of the corresponding FIM, whose functional form is summarized in Appendix D.

A. Quantized-Censored MLE Algorithms

The log-pmf is concave, following the proof of Proposition 1. The qcMLE algorithm will have the same censoring step as in Algorithm 1, with a quantization step to follow before the MLE, which will now be based entirely on digital data. The FC needs to store a table of the codebook used by each of the sensors in order to decode the received codewords. The censor-quantizer and estimator schemes are summarized in Algorithm 3 and Algorithm 4, respectively.

Algorithm 3 Censoring and Quantization (qcML)
Require: FC knows {h_n}_{n=1}^N, σ², K; sensor S_n knows h_n, σ²
Initialization:
  FC: Polls a small set of sensors for measurements, calculates θ̂₀ using (3)
  FC: Finds λ from (10) using a grid search
  FC: Broadcasts λ and θ̂₀
for n = 1, ..., N do
  Censoring:
    S_n: Receives λ and θ̂₀, forms its censoring threshold
    S_n: Censors x_n to get x̃_n using (9)
  Quantization:
    if x̃_n ≠ ∅ then
      S_n: Quantizes x̃_n using (17) to get q_n, which is transmitted to the FC at time slot n
    else
      S_n: Stays idle
    end if
end for


Algorithm 4 Quantized-Censored ML Estimation (qcML)
Require: Gradient tolerance ε; maximum iterations k_max
Data Reception:
  FC: Receives the codewords q_n; sets q_n = ∅ if no data is received at time slot n
Parameter Estimation:

B. MAP Estimation With Data Selection

Motivated by the need to reduce data transmission from the sensor nodes to the FC in a Bayesian estimation setting, the data selection approach can also be applied to MAP estimation [8], with the goal of finding a selection-based MAP (sMAP) estimator

  FC: Initialize θ⁰ = θ̂₀
  repeat
    FC: Calculate the log-pmf, its gradient, and its Hessian using (18), (19a), and (19b)
    FC: Use the Armijo rule [4] to find the step size
    FC: Take the damped-Newton step
  until the gradient norm falls below ε OR k_max iterations are reached
  FC: Set θ̂_qcML to the last iterate

VI. MAXIMUM A POSTERIORI ESTIMATION

A. MAP Estimation With Uncensored Data

Consider now that θ is random and that the uncensored data vector x

The MAP estimator (21) will benchmark the performance of MAP estimators based on censored and quantized data. Furthermore, if the prior pdf of θ is degenerate (noninformative), the MAP estimator in (21) coincides with the MLE obtained when θ is modeled as deterministic [9].
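The closed-form MAP estimator (21) and its error covariance (22) for the linear-Gaussian model can be sketched as follows; this is the standard textbook form for a Gaussian prior N(μ, C) and likelihood x = Hθ + w, with all sizes and values below chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Prior theta ~ N(mu, C); likelihood x = H theta + w, w ~ N(0, sigma^2 I).
p, N, sigma = 2, 50, 1.0
mu = np.zeros(p)
C = np.eye(p)
theta = rng.multivariate_normal(mu, C)
H = rng.normal(size=(N, p))
x = H @ theta + rng.normal(0.0, sigma, N)

# Closed-form linear-Gaussian MAP (= Bayesian MMSE) estimator:
# theta_map = (H^T H / sigma^2 + C^{-1})^{-1} (H^T x / sigma^2 + C^{-1} mu)
Cinv = np.linalg.inv(C)
A = H.T @ H / sigma**2 + Cinv
theta_map = np.linalg.solve(A, H.T @ x / sigma**2 + Cinv @ mu)

# Error covariance, cf. (22): the inverse of the same matrix A.
cov_post = np.linalg.inv(A)
print(theta_map)
```

As the prior covariance grows unbounded, `Cinv` vanishes and the estimator reduces to the LS/MLE solution, matching the degenerate-prior remark above.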

is a Gaussian random vector, with θ having the Gaussian prior pdf N(μ_θ, C_θ). The MAP estimator of θ based on x is given by

θ̂_MAP = arg max_θ [log p(x | θ) + log p(θ)].  (20)

Paralleling the MLE approach, the MAP estimator (20) can be obtained as the minimizer of the negative log-posterior, which leads to a regularized LS problem and admits the closed-form solution

θ̂_MAP = (H^T H/σ² + C_θ^{−1})^{−1} (H^T x/σ² + C_θ^{−1} μ_θ)  (21)

with estimator error covariance matrix

C_MAP = (H^T H/σ² + C_θ^{−1})^{−1}.  (22)

It should be noted that the MAP estimator for the linear-Gaussian model, given that the prior is a known Gaussian pdf, coincides with the Bayesian minimum MSE estimator [9, Ch. 11].

In the sMAP estimator, the data vector has zero entries corresponding to measurements not selected, and a denotes the measurement selection vector as in the sMLE setting of Section III-B. Following [8], entries of a are selected to approximately minimize the Bayesian MSE, subject to the cardinality constraint. This sensor selection method corresponds to the so-termed Bayesian A-optimal design, and its centralized implementation in [8] incurs a computational complexity that grows polynomially with N. It is solved offline at the FC prior to sensors acquiring any measurements, since the selection criterion does not depend on the measurements. With a given set of selected measurements sent to the FC, the sMAP estimator for the linear-Gaussian model (1) is obtained from the MAP estimator (21), (22). For nonlinear regression or non-Gaussian noise models, linearization would be needed to use the centralized sensor selection criterion of [8]. In the ensuing section, a novel decentralized sensor data selection is developed by using measurement censoring, whose overall complexity scales as O(N).

VII. DATA CENSORING AND MAP ESTIMATION

Measurement reduction via censoring is detailed first for a random θ, as an alternative to data selection that is both computationally efficient and decentralized. Analysis of the Bayesian MSE for the resultant censored MAP (cMAP) estimator is presented afterwards.

A. Data Censoring

Censoring when θ is random differs from that of the cMLE. Using N(μ_θ, C_θ) as the prior, it follows that x_n ~ N(h_n^T μ_θ, σ_n²), where σ_n² := h_n^T C_θ h_n + σ². Letting τ_n denote the censoring threshold at S_n, and leveraging knowledge of the data pdf, consider the following per-sensor censoring rule:

x̃_n = x_n, if |x_n − h_n^T μ_θ|/σ_n ≥ τ_n;  x̃_n = ∅, otherwise.  (23)

The normalized measurements (x_n − h_n^T μ_θ)/σ_n have pdf N(0, 1). Letting τ_n = τ, the probability that a measurement is not censored is clearly

Pr{x̃_n ≠ ∅} = 2Q(τ)  (24)


while the expected number of uncensored measurements is given by Σ_{n=1}^N Pr{x̃_n ≠ ∅}.


If the expected number of uncensored measurements is bounded to be at most K, it follows that

τ = Q^{−1}(κ/2)  (25)

where κ := K/N denotes the fraction of uncensored measurements. Since the normalized measurements are identically distributed, identical thresholds τ_n = τ, n = 1, ..., N, will be used; (25) gives the censoring threshold for which the average number of uncensored measurements is K. Censoring in (23) can alternatively be expressed as

x̃_n = x_n, if x_n ∉ (τ_{l,n}, τ_{u,n});  x̃_n = ∅, otherwise  (26)

where τ_{l,n} := h_n^T μ_θ − τσ_n and τ_{u,n} := h_n^T μ_θ + τσ_n. Note the difference between the censor-thresholds adopted by cMAP and those used for cMLE in Section IV-A.

B. MAP Estimation With Censored Data

The measurements used in estimation are distributed according to the conditional pdf given by the following lemma, whose proof is quite similar to that of Lemma 1 and is thus omitted.

Lemma 4: For the latent data model (1), and with the censoring per sensor performed as in (26), the censored data {x̃_n}_{n=1}^N are conditionally independent given θ; i.e., p(x̃ | θ) = Π_{n=1}^N p(x̃_n | θ). Using Bayes' rule, it follows that p(θ | x̃) ∝ p(x̃ | θ) p(θ), and the cMAP estimator maximizes this posterior [cf. (27)].

Taking the logarithm of both sides of (28) yields the log-posterior in (29). Based on (29), the cMAP estimator (27) can thus be obtained as the maximizer in (30).

The following proposition, whose proof is an extension of that of Proposition 1 given in Appendix A, summarizes conditions ensuring a unique solution to the optimization problem in (30).

Proposition 2: The function in (29) is strictly concave if at least one of two conditions holds; the conditions are detailed in Appendix A.

C. Censored MAP Algorithm

The function in (29) is twice differentiable, with gradient and Hessian given by (31a)

(31b) where and are defined in Appendix E. The details of how the censoring and iterative estimation are implemented with an FC-based WSN are quite similar to those of the cMLE approach, except that now both FC and sensors are assumed to know the prior pdf, and the FC broadcasts the censoring threshold instead of used in the cMLE. The MAP censoring and estimation schemes are tabulated as Algorithms 5 and 6, respectively. Algorithm 5 Censoring (cMAP) Require: FC knows knows

,

,

,

(27)

Initialization:

From (1) it follows that the conditional pdf in (27) is [cf. (26)]

FC: Calculates

,

,

,

,

,

FC: Broadcasts for

do

where

: Receives , and calculates thresholds : Censors if

Using and terms in (27) can be written compactly as

to find

then : Transmit

, the product else

: Stays idle end if (28)

end for

using (26) to FC

and
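The threshold design behind Algorithm 5, choosing the censoring threshold so that the expected fraction of uncensored (transmitted) measurements meets a budget, reduces for standard-normal normalized measurements to inverting the Gaussian cdf. A sketch, with an illustrative budget of 0.3 and an illustrative sample size:

```python
import random
import statistics

random.seed(1)
nd = statistics.NormalDist()

def censoring_threshold(kappa):
    """Threshold tau with P(|z| > tau) = kappa for standard-normal z,
    so on average a fraction kappa of the measurements is transmitted."""
    return nd.inv_cdf(1.0 - kappa / 2.0)

def censor(measurements, kappa):
    """Keep (index, value) pairs whose magnitude exceeds tau; the rest
    are censored, i.e., never transmitted to the fusion center."""
    tau = censoring_threshold(kappa)
    return [(i, z) for i, z in enumerate(measurements) if abs(z) > tau]

N, kappa = 100_000, 0.3                     # illustrative budget
z = [random.gauss(0.0, 1.0) for _ in range(N)]
kept = censor(z, kappa)
print(len(kept) / N)                        # close to kappa
```

Each sensor can evaluate this rule locally, so no inter-sensor collaboration is needed, which is the sensor-centric property the paper emphasizes.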


IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 60, NO. 1, JANUARY 2012

Algorithm 6 Censored MAP Estimation (cMAP)
Require: Gradient tolerance … ; maximum iterations …
Data Reception:
  FC: Receives … , where …
  if no data received at time … then
    FC: …
  else
    FC: …
  end if
Estimation:
  FC: Initializes … , drawn from …
  repeat
    FC: Calculates … , … , and … using (29), (31a), and (31b)
    FC: Finds the step size using Armijo's line search rule [4]
    FC: …
    FC: …
  until … OR …
  FC: Sets …

Fig. 3. Fractional contribution factor (FCF) to FIM.
Fig. 4. Censored data pdf.
Fig. 5. L-level quantizer thresholds for uncensored data pdf.

D. Performance Analysis of the cMAP Estimator

The Bayesian CRLB (BCRLB) provides a lower bound on the Bayesian MSE of an estimator [27, p. 72]. The BCRLB is based on the FIM … , which is defined as

(32)

Derivation of the FIM in (32), including the definition of … , is detailed in Appendix F. The BCRLB for the censored-data model provides a lower bound on the MSE of any Bayesian estimator based on the censored data, as summarized next.

Lemma 5: Given the prior pdf … , for any estimator … of … based on … drawn from the pdf … , it holds that … , where the expectation is over the joint pdf … .

The scalar … in (32) is related to the contribution of the …th measurement to the "increase" of the information in the FIM expression (32). Let … , for … , be called the fractional contribution factor (FCF). For the censored-data FIM in (32), the FCF obtained using (36) is approximately the same for all measurements. The variation of the FCF with … can be deduced after substituting … from (25) into (36).

If … out of the … measurements were uniformly selected at random, it can be readily shown that the FIM would be … . Thus, the FCF corresponding to random selection equals … . For the full-data pdf, each measurement's FCF is … . Fig. 3 illustrates the variation of the FCF with … for the censored-data FIM, the uncensored (or full)-data FIM, and the random-selection FIM, given a scalar … with … . It can be seen that with a value of about 0.7, the censored-data FCF equals approximately that of the full-data pdf. The linear variation of the FCF for random selection serves to highlight the advantage of using censoring over random selection. For the Bayesian A-optimal design in [8], the FCF is obtainable only numerically due to the lack of a closed-form expression for the selection indexes … . Since the selected sensors effect


the largest reduction in MSE, the A-optimal design's FCF will generally be larger than the FCF achieved by random selection. However, the censored-data pdf accounts not only for the selected (uncensored) measurements, but also for information from the … censored measurements, namely the probability … from (24), and thus it should have an FCF larger than that corresponding to the A-optimal selection. In Section IX, comparison of the MSE, which is related to the FIM via the BCRLB, will corroborate this intuition.

VIII. MAP ESTIMATION WITH QUANTIZED-CENSORED DATA

If the … measurement is not censored, Bayes' rule yields the conditional pdf … as a truncated Gaussian with mean … and variance … ; that is,

…

Using (24) and (25), it clearly follows that … . Due to symmetry of the conditional pdf … , it readily follows that … . Unlike the qcMLE approach in (16), in the quantized-censored MAP (qcMAP) case, sensor … has knowledge of the truncated Gaussian pdf and does not need to invoke the estimate … for the quantization step. Each uncensored measurement is quantized using an L-level Lloyd-Max quantizer, yielding the finite-alphabet data … . The conditional pmf for the quantized data is given by … , where … , … , and … are defined as in Section V. Since the posterior pdf is … , its logarithm, ignoring additive terms not dependent on … , is given by

(33)

where … and … . The corresponding qcMAP estimator is … .

It should be noted that the estimator will be a function not only of the censoring thresholds … but also of the quantizer thresholds … . Intuitively, the more finely quantized the uncensored measurements are (i.e., the larger L is), the closer the MSE of the qcMAP will be to that of the unquantized cMAP; this will be observed in the numerical studies of Section IX.
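Lloyd-Max quantizer design [10], [17] for the uncensored (two-sided tail) pdf can be approximated by running Lloyd's alternating iteration on an empirical sample. A sketch, where the censoring threshold 1.0, the level count L = 4, and the sample size are illustrative assumptions:

```python
import random
import statistics

random.seed(3)

def lloyd_max(samples, L, iters=50):
    """Lloyd's algorithm on empirical samples: alternate nearest-level
    assignment and centroid (conditional-mean) updates."""
    levels = sorted(random.sample(samples, L))
    for _ in range(iters):
        buckets = [[] for _ in range(L)]
        for x in samples:
            j = min(range(L), key=lambda k: (x - levels[k]) ** 2)
            buckets[j].append(x)
        levels = [statistics.fmean(b) if b else levels[j]
                  for j, b in enumerate(buckets)]
    return sorted(levels)

# Samples mimicking uncensored measurements: a two-sided Gaussian tail.
samples = [z for z in (random.gauss(0.0, 1.0) for _ in range(20000))
           if abs(z) > 1.0]
levels = lloyd_max(samples, L=4)
mse_q = statistics.fmean(min((x - c) ** 2 for c in levels) for x in samples)
print(levels, mse_q)
```

Increasing L drives the quantization distortion toward zero, which mirrors the qcMAP-to-cMAP MSE convergence noted above.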

Fig. 6.  versus MSE: fixed SNR, cMLE.

The gradient and Hessian of the objective function (33) are detailed in Appendix G. Algorithms for qcMAP censoring and estimation can be obtained from Algorithms 5 and 6, respectively, by replacing the cMAP gradient and Hessian with their quantized-data counterparts (cf. Appendix G). The expression for the FIM of the quantized-censored pmf is also provided in Appendix G.

IX. NUMERICAL TESTS

A. MLE Simulations

Three metrics are used to compare the performance of the different estimators: i) the average fraction of active sensors … , obtained by averaging over the Monte Carlo runs; ii) the empirical MSE … , where … is the true parameter vector and … is the estimate obtained from the …th Monte Carlo run; and iii) the signal-to-noise ratio (SNR) … . Simulations are performed for the model in (1), and compare the performance of the cMLE, the A-optimal DOE-based MLE (from [8, Sec. V]), the MLE based on the full data, and the CRLB. The A-optimal DOE selects the regressors that minimize the MSE of the estimator in (2). Unless otherwise stated, the regressors are picked uniformly over [−1, 1] and kept fixed for all the Monte Carlo runs.

Fig. 6 depicts the variation of the MSE with … for a fixed SNR value. A linear regression model of dimension … is simulated with scalar measurements from … sensors. The A-optimal MLE has MSE performance that is indistinguishable from that of an MLE based on randomly selected measurements. The censoring-based MLE outperforms the A-optimal MLE; in fact, the MSE values for the cMLE almost coincide with the CRLB over the entire range of … values. The improved performance of the cMLE is due to the additional partial information, namely … , gained from the censored measurements.

Fig. 7 illustrates the variation of the SNR with … for a fixed MSE target. It is seen that the cMLE requires fewer measurements than the A-optimal MLE to achieve an MSE value of 1.00 0.25 over the 30 to 22 dB SNR range.

Figs. 8 and 9 compare … versus the MSE for a fixed SNR, using two different distributions of the regressor norms; the regressor norms are depicted in the corresponding plots. In the first case of Fig. 8, the regressors have identical norms, which are


Fig. 7.  versus SNR: fixed MSE, cMLE.

Fig. 9.  versus MSE: heterogeneous regressors, cMLE.

Fig. 8.  versus MSE: homogeneous regressors, cMLE.

Fig. 10.  versus MSE: fixed SNR, cMAP estimator.

termed homogeneous regressors. The MSE of the cMLE is close to that of the full-data MLE even for low … values. The A-optimal MLE here exhibits slightly larger MSE than random selection. The reason is that, since homogeneous regressors have a comparable effect on the estimator's covariance matrix, there is no advantage to selecting them via an optimization criterion over picking them uniformly at random. Incidentally, the regressors in Fig. 7 were homogeneously distributed as well, hence the near-coincidence of the A-optimal MLE and random-selection MLE curves. A test case with the same settings as in Fig. 8, but with some regressors having a larger norm than others, is depicted in Fig. 9. This scenario with heterogeneous regressors is encountered, e.g., in source localization, where measurements from sensors closer to the source typically achieve a larger MSE reduction than those distant from the source. A-optimal selection excels in the heterogeneous-regressors case since it picks the measurements whose regressors yield the larger reduction in the MSE. It can be seen from Fig. 9 that the MSE of the cMLE and of the A-optimal MLE are indistinguishable for all … values, and both are close to the MSE of the full-data MLE. In [18], the cMLE algorithm has been employed to predict hourly temperature variations based on real-world WSN data.
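The full-data versus random-selection gap discussed above can be reproduced in miniature: least squares (the MLE for this model) on all measurements attains the CRLB, while keeping a uniformly random fraction inflates the MSE roughly by the inverse of that fraction. A Monte Carlo sketch under toy, illustrative settings (dimensions, noise level, true parameter, and seed are all assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
N, p, sigma, runs, kappa = 100, 2, 0.5, 400, 0.3
H = rng.uniform(-1.0, 1.0, size=(N, p))    # regressors, fixed across runs
theta = np.array([1.0, -0.5])               # hypothetical true parameter

def empirical_mse(select):
    """Average squared error of the LS (ML) estimate over Monte Carlo
    runs, using only the measurement subset returned by select()."""
    err = 0.0
    for _ in range(runs):
        y = H @ theta + sigma * rng.standard_normal(N)
        idx = select()
        th_hat, *_ = np.linalg.lstsq(H[idx], y[idx], rcond=None)
        err += float(np.sum((th_hat - theta) ** 2))
    return err / runs

mse_full = empirical_mse(lambda: np.arange(N))
mse_rand = empirical_mse(lambda: rng.choice(N, size=int(kappa * N),
                                            replace=False))
crlb_full = sigma**2 * np.trace(np.linalg.inv(H.T @ H))
print(mse_full, crlb_full, mse_rand)
```

The random-selection MSE exceeding the full-data MSE by roughly 1/kappa is exactly the linear FCF behavior noted in Section VII-D.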

B. MAP Simulations

Simulations for the MAP approach were also carried out for the linear model (1). As with the ML approach, the three metrics used for performance comparison are: i) the number of active sensors … ; ii) the empirical MSE … , where … is the …th realization of … drawn from the prior pdf and … is the corresponding estimate from the …th Monte Carlo run; and iii) the SNR. The full-data MAP estimator and the BCRLB were used to benchmark the performance comparison. A random-selection-based estimator and the Bayesian A-optimal design MAP estimator were compared to the cMAP and qcMAP estimators. In Fig. 10, heterogeneous regressors were used. It is seen that the cMAP estimator outperforms the estimators based on random selection and on the Bayesian A-optimal design. Since the cMAP uses the additional knowledge from the censored measurements, its performance is improved without requiring extra data transmissions. It can be seen that the cMAP is almost coincident with the BCRLB even at low … values.
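For the linear-Gaussian model, the BCRLB benchmark used here is available in closed form: the Bayesian MSE of the MAP (posterior-mean) estimator equals the trace of the posterior covariance. A Monte Carlo sketch under toy, illustrative settings (prior, dimensions, and seed are assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)
N, p, sigma, runs = 40, 2, 1.0, 2000
H = rng.uniform(-1.0, 1.0, size=(N, p))
C0 = np.eye(p)                            # zero-mean Gaussian prior (assumed)

# Posterior covariance; its trace is the Bayesian MSE (and the BCRLB here).
post_cov = np.linalg.inv(np.linalg.inv(C0) + H.T @ H / sigma**2)
bcrlb = float(np.trace(post_cov))

err = 0.0
for _ in range(runs):
    theta = rng.multivariate_normal(np.zeros(p), C0)   # draw from the prior
    y = H @ theta + sigma * rng.standard_normal(N)
    theta_map = post_cov @ (H.T @ y) / sigma**2        # MAP = posterior mean
    err += float(np.sum((theta_map - theta) ** 2))
print(err / runs, bcrlb)
```

The empirical average squared error and the closed-form trace agree up to Monte Carlo noise, which is why the BCRLB is a tight benchmark in this setting.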


In Fig. 11 it is seen that, at an SNR of 6 dB, the cMAP estimator achieves the full-data MAP estimator MSE at … . However, as the SNR degrades, so does the performance of the selection-based and censoring-based MAP estimators. For example, at an SNR of 0 dB there is an MSE gap between the cMAP and the full-data MAP even for large … values. The A-optimal design exhibits an even larger MSE gap to the full-data MAP at 0 dB.

In the last simulation setting, Fig. 12 compares the full-data MAP, the cMAP, and the qcMAP with L = 2, 4, and 6; Fig. 13 shows a magnified portion of the curves in Fig. 12. It is seen that with L = 4 (corresponding to 2-bit quantization of the uncensored data) the MSE of the qcMAP approaches that of the unquantized cMAP. As is the case in the purely quantizer-estimator approach of [20], there are diminishing returns in MSE gains as the number of quantization levels increases. In fact, for this particular example, the difference in MSE performance of the qcMAP for L = 4 and for L = 6 is not significant enough to justify the increased data rate of a 6-level quantizer.

Fig. 11.  versus MSE: high and low SNR cases, cMAP estimator.

Fig. 12.  versus MSE: fixed SNR, qcMAP estimator.

Fig. 13.  versus MSE: magnified part of Fig. 12.

Fig. 14. Regressor influence on probability of censoring.

X. CONCLUDING REMARKS

Sensor data reduction aimed at saving communication resources for parameter estimation with wireless sensor networks was investigated in this paper. Formulated as a censoring problem, a novel sensor-centric data selection method was introduced. Iterative estimators under the maximum likelihood and maximum a posteriori probability paradigms were developed by incorporating knowledge of the censoring technique in the estimator derivation. Performance analysis based on the Cramér–Rao lower bound, together with numerical simulation studies, demonstrated the potential of measurement censoring as an effective data-reduction alternative for estimation with large wireless sensor networks. An extension to quantization of the uncensored measurements was also pursued. Future research will explore censoring ideas in two directions: nonlinear regression models, and real-time tracking of time-varying signals.

APPENDIX

A. Proof of Proposition 1

It can be readily proved that a doubly-differentiable function of the form … is strictly log-concave if and only if … . This simple result will be used to show that … is sufficient for log-concavity of both the Gaussian pdf and the corresponding ccdf. Since … corresponds to the negative of the exponent of the Gaussian pdf of the data in (1), this condition is sufficient for log-concavity of the uncensored components of the joint pdf (12). What remains is to show that this condition is sufficient


for log-concavity of the ccdf in (12) as well. The …th term of the ccdf component can be written as … , which simplifies to … . It follows readily that this pdf is a log-concave function of … as well. From [22, Theorem 2], integration of a log-concave function preserves log-concavity; the ccdf is therefore log-concave. Consequently, … is strictly concave if … .

B. Gradient and Hessian in the cMLE Algorithm

The first and second derivatives of … in Lemma 1, with the parameter vector given by … , yield the gradient and the Hessian in (15a) and (15b), respectively, where

(34a)
(34b)
(34c)
(34d)

with … and … .

C. The FIM in Lemma 2

Using the definitions in (34c) and (34d), the pdf in Lemma 1 yields

(35)

from which the FIM can be expressed as … .

D. Gradient, Hessian, and FIM in qcMLE

The first and second derivatives of … lead to the gradient and the Hessian in (19a) and (19b), respectively, with … and … defined as … . The FIM is obtained as … from the pmf (18), following similar derivation steps as in Appendix C, where … .

E. Gradient and Hessian in the cMAP Algorithm

The first and second derivatives of the objective … give the gradient and the Hessian in (31a) and (31b), respectively, where … .
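The iteration of Algorithm 6 (Newton steps built from a gradient and Hessian such as (31a) and (31b), damped by Armijo's backtracking rule [4]) can be sketched generically; the quadratic objective below is only a stand-in for the cMAP objective, not the paper's:

```python
import numpy as np

def newton_armijo(f, grad, hess, theta0, tol=1e-8, max_iter=100,
                  beta=0.5, c=1e-4):
    """Damped Newton minimization with Armijo backtracking: stop when
    the gradient norm falls below tol OR max_iter is reached."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(theta), -g)       # Newton direction
        t = 1.0
        # Backtrack until the Armijo sufficient-decrease condition holds.
        while f(theta + t * d) > f(theta) + c * t * float(g @ d):
            t *= beta
            if t < 1e-12:
                break
        theta = theta + t * d
    return theta

# Toy strictly convex stand-in for the negative cMAP objective.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 0.0])
theta_hat = newton_armijo(lambda th: 0.5 * th @ A @ th - b @ th,
                          lambda th: A @ th - b,
                          lambda th: A,
                          np.zeros(2))
print(theta_hat)   # converges to A^{-1} b
```

Strict concavity of the (negated) objective, as guaranteed by Propositions 1 and 2, is what makes the Newton direction well defined at every iterate.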


F. The FIM in Lemma 5

The FIM … defined in (32) is found using the pdf … , whereby … , from which … . The FIM is thus given by … . Numerical integration is necessary to compute the expectation with respect to … . To this end, an approximation amounting to … is used. Such an approximation is justifiable in at least two cases: i) if the prior … is approximately constant over the support of … ; see, e.g., [14] for the justification of using a point estimate in lieu of a difficult integration in Bayesian models; and ii) if the Taylor expansion of … about … up to the quadratic term provides a sufficiently accurate approximation; that is, … . Taking the expectation of … and using … leads to (36).

G. Gradient, Hessian, and FIM for the qcMAP Estimator

From the first and second derivatives of the function in (33), the gradient and Hessian are … , where … and … are defined in Appendix D, with … now considered random. The FIM is obtained from the function … , which is defined as … , where … is obtained from (35) assuming … is random. The FIM for the quantized-censored data is then obtained as … , which simplifies to … . Numerical integration is necessary to compute the expectation [cf. Appendix F].

REFERENCES

[1] P. Addesso, S. Marano, and V. Matta, "Sequential sampling in sensor networks for detection with censoring nodes," IEEE Trans. Signal Process., vol. 55, no. 11, pp. 5497–5505, Nov. 2007.
[2] M. Ali, U. Saif, A. Dunkels, T. Voigt, K. Römer, K. Langendoen, J. Polastre, and Z. A. Uzmi, "Medium access control issues in sensor networks," ACM SIGCOMM Comput. Commun. Rev., vol. 36, no. 2, pp. 33–36, Apr. 2006.
[3] T. Arampatzis, J. Lygeros, and S. Manesis, "A survey of applications of wireless sensors and wireless sensor networks," in Proc. IEEE Int. Symp. Intell. Control, Limassol, Cyprus, Jun. 2005, pp. 719–724.
[4] D. P. Bertsekas, Nonlinear Programming. Belmont, MA: Athena Scientific, 1999.
[5] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[6] R. M. Gray, "Quantization in task-driven sensing and distributed processing," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Toulouse, France, May 2006, pp. 1049–1052.
[7] J. A. Gubner, "Distributed estimation and quantization," IEEE Trans. Inf. Theory, vol. 39, pp. 1456–1459, Jul. 1993.
[8] S. Joshi and S. Boyd, "Sensor selection via convex optimization," IEEE Trans. Signal Process., vol. 57, no. 2, pp. 451–462, Feb. 2009.
[9] S. M. Kay, Fundamentals of Statistical Signal Processing—Estimation Theory. New York: Prentice-Hall, 1993.
[10] J. Kieffer, "Exponential rate of convergence for Lloyd's Method I," IEEE Trans. Inf. Theory, vol. 28, no. 2, pp. 205–210, Mar. 1982.
[11] A. Krause, A. Singh, and C. Guestrin, "Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies," J. Mach. Learn. Res., vol. 9, pp. 235–284, Feb. 2008.
[12] K. Lorincz, D. J. Malan, T. R. F. Fulford-Jones, A. Nawoj, A. Clavel, V. Shnayder, G. Mainland, M. Welsh, and S. Moulton, "Sensor networks for emergency response: Challenges and opportunities," IEEE Pervas. Comput., vol. 3, no. 4, pp. 16–23, Jul. 2004.
[13] J. P. Lynch and K. J. Loh, "A summary review of wireless sensors and sensor networks for structural health monitoring," Shock Vibrat. Dig., vol. 38, no. 2, pp. 91–130, Mar. 2006.
[14] D. J. C. MacKay, "Bayesian interpolation," Neural Comput., vol. 4, pp. 415–447, May 1992.
[15] D. J. C. MacKay, "Information-based objective functions for active data selection," Neural Comput., vol. 4, pp. 590–604, Jul. 1992.


[16] A. Mainwaring, D. Culler, J. Polastre, R. Szewczyk, and J. Anderson, "Wireless sensor networks for habitat monitoring," in Proc. 1st ACM Int. Workshop Wireless Sensor Netw. Appl., New York, Sep. 2002, pp. 88–97.
[17] J. Max, "Quantizing for minimum distortion," IRE Trans. Inf. Theory, vol. 6, pp. 7–12, Mar. 1960.
[18] E. J. Msechu and G. B. Giannakis, "Distributed measurement censoring for estimation with wireless sensor networks," in Proc. 12th Int. Workshop Signal Process. Adv. Wireless Commun., San Francisco, CA, Jun. 2011, pp. 216–220.
[19] E. J. Msechu and G. B. Giannakis, "Decentralized data selection for MAP estimation: A censoring and quantization approach," in Proc. 14th Int. Conf. Inf. Fusion, Chicago, IL, Jul. 2011, pp. 131–138.
[20] E. J. Msechu, S. I. Roumeliotis, A. Ribeiro, and G. B. Giannakis, "Decentralized quantized Kalman filtering with scalable communication cost," IEEE Trans. Signal Process., vol. 56, no. 8, pp. 3727–3741, Aug. 2008.
[21] C. Otto, A. Milenkovia, C. Sanders, and E. Jovanov, "System architecture of a wireless body area sensor network for ubiquitous health monitoring," J. Mobile Multimedia, vol. 1, no. 4, pp. 307–326, Apr. 2006, Rinton Press, Los Alamitos, CA.
[22] A. Prékopa, "Logarithmic concave measures and related topics," Stoch. Programm., pp. 63–82, 1980.
[23] F. Pukelsheim, Optimal Design of Experiments. Philadelphia, PA: SIAM, 2006.
[24] C. Rago, P. Willett, and Y. Bar-Shalom, "Censoring sensors: A low-communication-rate scheme for distributed detection," IEEE Trans. Aerosp. Electron. Syst., vol. 32, no. 2, pp. 554–568, Apr. 1996.
[25] A. Ribeiro and G. B. Giannakis, "Bandwidth-constrained distributed estimation for wireless sensor networks—Part I: Gaussian case," IEEE Trans. Signal Process., vol. 54, no. 3, pp. 1131–1143, Mar. 2006.
[26] I. D. Schizas, G. B. Giannakis, and Z.-Q. Luo, "Distributed estimation using reduced-dimensionality sensor observations," IEEE Trans. Signal Process., vol. 55, no. 8, pp. 4284–4299, Aug. 2007.
[27] H. L. Van Trees, Detection, Estimation, and Modulation Theory, Part I. New York: Wiley, 1968.
[28] T. Wark, P. Corke, P. Sikka, L. Klingbeil, Y. Guo, C. Crossman, P. Valencia, D. Swain, and G. Bishop-Hurley, "Transforming agriculture through pervasive wireless sensor networks," IEEE Pervas. Comput., vol. 6, no. 3, pp. 50–57, Mar. 2007.
[29] Y. Zhu, E. Song, J. Zhou, and Z. You, "Optimal dimensionality reduction of sensor data in multisensor estimation fusion," IEEE Trans. Signal Process., vol. 53, no. 5, pp. 1631–1639, May 2005.

Eric J. Msechu (M’11) received the B.Sc. (Hons.) degree in electrical engineering from the University of Dar es Salaam, Tanzania, 1999, the M.Sc. degree in communications technology from the University of Ulm, Germany, 2003, and Ph.D. degree in electrical engineering from the University of Minnesota, Minneapolis, in 2011. He currently works as a Design Engineer for Intel Corporation, Folsom, CA. He previously worked for Nokia Mobile Phones-Germany from 2001 to 2003 and for Millicom International Cellular-Tanzania from 1999 to 2001. His research has been on iterative equalization for multi-user MIMO-OFDM and -MC/CDMA wireless receivers and distributed data reduction for estimation with wireless sensor networks. Dr. Msechu is a member of ISIF and serves on the international program committee for SENSORNETS 2012. He is a reviewer for the IEEE TRANSACTIONS ON SIGNAL PROCESSING, the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, the IEEE SIGNAL PROCESSING LETTERS, and EURASIP and IET journals on signal processing.

Georgios B. Giannakis (F’97) received the Diploma degree in electrical engineering from the National Technical University of Athens, Greece, in 1981 and the M.Sc. degree in electrical engineering, the M.Sc. degree in mathematics, and the Ph.D. degree in electrical engineering from the University of Southern California (USC) in 1983, 1986, and 1986, respectively. Since 1999, he has been a Professor with the University of Minnesota, where he now holds an ADC Chair in Wireless Telecommunications in the Electric and Computer Engineering Department and serves as Director of the Digital Technology Center. His general interests span the areas of communications, networking and statistical signal processing subjects on which he has published more than 300 journal papers, 500 conference papers, two edited books, and two research monographs. Current research focuses on compressive sensing, cognitive radios, network coding, cross-layer designs, mobile ad hoc networks, wireless sensor, power, and social networks. Dr. Giannakis is the (co-)inventor of 21 patents issued, and the (co-)recipient of seven paper awards from the IEEE Signal Processing (SP) and Communications Societies, including the G. Marconi Prize Paper Award in Wireless Communications. He also received Technical Achievement Awards from the SP Society (2000), from EURASIP (2005), a Young Faculty Teaching Award, and the G. W. Taylor Award for Distinguished Research from the University of Minnesota. He is a Fellow of EURASIP and has served the IEEE in a number of posts, including that of a Distinguished Lecturer for the IEEE-SP Society.
