Bootstrap Particle Filtering

James V. Candy


Performance of a passive synthetic aperture in an uncertain ocean environment

In the real world, systems designed to extract signals from noisy measurements are plagued by errors evolving from constraints of the sensors employed, by random disturbances and noise, and, probably the most common, by the lack of precise knowledge of the underlying physical phenomenology generating the process in the first place. Thus, there is a strong need to incorporate any and all of the a priori knowledge available. Methods capable of extracting the desired signal from hostile environments require approaches that capture all of the information available and incorporate it into a processing scheme. This approach is model based [1], employing mathematical representations of the component processes involved, and the processor design evolves from the realm of statistical signal processing using a Bayesian approach. When signals are deeply buried in noise, especially as in ocean acoustics, processing techniques utilizing these representations of the underlying phenomenology must be employed [2]. Here we develop the Bayesian model-based approach for processing highly uncertain ocean acoustic data.

The Bayesian approach to statistical signal processing includes the next generation of processors that have recently been realized with the advent of high-speed/high-throughput computers [3]. A brief tutorial of Bayesian nonlinear statistical signal processing techniques is presented, commencing with an overview of Bayesian techniques and sequential processors [3]. Once the evolving Bayesian paradigm is established, simulation-based methods using sampling theory and Monte Carlo (MC) realizations are discussed [3]–[11]. Here the usual limitations of nonlinear approximations and non-Gaussian processes prevalent in classical nonlinear processing algorithms (e.g., approximate Kalman filters) are no longer a restriction to perform Bayesian processing. It is shown how the underlying problem variables are easily assimilated into the Bayesian construct. Next, importance sampling methods are discussed, and it is shown how they can be extended to sequential solutions implemented using Markovian models as their natural evolution [4]–[6]. With this in mind, the concept of a particle filter (PF), a discrete nonparametric representation of a probability distribution, is developed, and it is shown how the PF can be implemented in a bootstrap manner using sequential importance sampling/resampling (SIR) methods to perform statistical inferences, yielding a suite of popular estimators such as the conditional expectation, maximum a posteriori (MAP), and median filters [7]–[15].

With the generic bootstrap PF developed, we investigate the passive synthetic aperture problem from the signal processing perspective. We briefly define the problem and then develop a mathematical representation of a towed hydrophone sensor array in the ocean coupled to targets in random motion. Next, we develop a pragmatic simulation using this nonlinear space-time model, discussing some of the important intricacies embedded within. We design the bootstrap PF for this problem and apply it to synthesized hydrophone data.

BAYESIAN APPROACH TO SIGNAL PROCESSING

Modern statistical signal processing techniques evolve directly from a Bayesian perspective, i.e., they are cast into a probabilistic framework using Bayes' theorem as the fundamental construct. More specifically, the information about a random signal, x(t), required to solve a vast majority of estimation/processing problems is incorporated in the underlying probability distribution generating the process. For instance, the usual signal enhancement problem is concerned with providing the best (in some sense) estimate of the signal at time t based on all of the data available at that time. The corresponding distribution provides that information directly in terms of its underlying statistics. That is, by calculating the statistics of the process directly from its underlying distribution, an enhanced signal can be extracted using a variety of estimators as well as evaluated by its set of performance statistics [1], [2].

Bayesian signal processing consists of three steps with the primary objective of estimating the posterior (after data is available) distribution governing the signal, x(t), under investigation. The first step is gathering the underlying information (priors) governing its evolution. The next step is to construct the underlying posterior distribution, Pr(x(t)|Y_t), using Bayes' rule [3] and develop the algorithm for posterior estimation; it can be thought of as transforming the broad prior to the narrow posterior. The final step, after obtaining the posterior, is to decide on and extract meaningful statistics yielding the solution of the signal processing problem.

We define the unobserved signal or, equivalently, hidden variables as the set of N_x-vectors, {x(t)}, t = 0, ..., N, and the observables or, equivalently, measurements as the set of N_y-vectors, {y(t)}, t = 0, ..., N, considered to be conditionally independent of the signal variables. The goal in sequential Bayesian estimation is to sequentially estimate the conditional posterior distribution, Pr(x(0), ..., x(N)|y(0), ..., y(N)). Once the posterior is estimated, many of the interesting statistics characterizing the process under investigation can be exploited to extract meaningful information.

We start by defining two sets of random (vector) processes: X_t := {x(0), ..., x(t)} and Y_t := {y(0), ..., y(t)}. Here we can consider X_t to be the set of dynamic random variables or parameters of interest and Y_t the set of measurements or observations of the desired process. We start with Bayes' theorem for the conditional distribution:

$$\Pr(X_t|Y_t) = \frac{\Pr(Y_t|X_t) \times \Pr(X_t)}{\Pr(Y_t)}. \tag{1}$$

In Bayesian theory, the posterior defined by Pr(X_t|Y_t) is decomposed in terms of the prior Pr(X_t), its likelihood Pr(Y_t|X_t), and the evidence or normalizing factor, Pr(Y_t). Each has a particular significance in this construct. It has been shown [3], [4] that under Markovian assumptions the distribution can be expressed, sequentially, as

$$\underbrace{\Pr(X_t|Y_t)}_{\text{new}} = \underbrace{W(t, t-1)}_{\text{weight}} \times \underbrace{\Pr(X_{t-1}|Y_{t-1})}_{\text{old}}, \tag{2}$$

and the weight is defined by

$$W(t, t-1) := \frac{\Pr(y(t)|x(t)) \times \Pr(x(t)|x(t-1))}{\Pr(y(t)|Y_{t-1})}.$$

This result is satisfying in the sense that we need only know the posterior distribution at the previous stage, t − 1, scaled by a weighting function to sequentially propagate the posterior to the next stage. Even though this expression provides the full posterior solution, it is not physically realizable unless the distributions are known in closed form and the underlying multiple integrals or sums can be analytically determined. In fact, a more useful solution is the marginal posterior distribution [3], [4] given by the update recursion


$$\underbrace{\Pr(x(t)|Y_t)}_{\text{posterior}} = \frac{\overbrace{\Pr(y(t)|x(t))}^{\text{likelihood}} \times \overbrace{\Pr(x(t)|Y_{t-1})}^{\text{prior}}}{\underbrace{\Pr(y(t)|Y_{t-1})}_{\text{evidence}}}, \tag{3}$$

where we can consider the update or filtering distribution as a weighting of the prediction distribution as in the full case above, i.e.,

$$\underbrace{\Pr(x(t)|Y_t)}_{\text{update}} = \underbrace{W_u(t, t-1)}_{\text{weight}} \times \underbrace{\Pr(x(t)|Y_{t-1})}_{\text{prediction}}, \tag{4}$$

where the weight in this case is defined by

$$W_u(t, t-1) := \frac{\Pr(y(t)|x(t))}{\Pr(y(t)|Y_{t-1})}.$$

We summarize the sequential Bayesian processor in Table 1. These two sequential relations form the theoretical foundation of many of the sequential PF designs.

[TABLE 1] SEQUENTIAL BAYESIAN PROCESSOR FOR FILTERING POSTERIOR.

PREDICTION: $\Pr(x(t)|Y_{t-1}) = \int \Pr(x(t)|x(t-1)) \times \Pr(x(t-1)|Y_{t-1})\, dx(t-1)$

UPDATE/POSTERIOR: $\Pr(x(t)|Y_t) = \Pr(y(t)|x(t)) \times \Pr(x(t)|Y_{t-1}) / \Pr(y(t)|Y_{t-1})$

INITIAL CONDITIONS: $\Pr(x(0)|Y_0)$
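To make the Table 1 recursion concrete, the following is a minimal sketch (not from the original article) of the prediction/update recursion on a discretized (gridded) state space, where the prediction integral becomes a sum; the Gaussian transition and likelihood functions are illustrative assumptions.

```python
import numpy as np

# Minimal grid-based realization of the Table 1 recursion (illustrative only).
grid = np.linspace(-5.0, 5.0, 201)                  # discretized state space
dx = grid[1] - grid[0]

def transition(x_new, x_old, q=0.5):
    # assumed Gaussian random-walk transition Pr(x(t)|x(t-1))
    return np.exp(-0.5 * (x_new - x_old) ** 2 / q) / np.sqrt(2 * np.pi * q)

def likelihood(y, x, r=1.0):
    # assumed Gaussian measurement likelihood Pr(y(t)|x(t))
    return np.exp(-0.5 * (y - x) ** 2 / r) / np.sqrt(2 * np.pi * r)

# INITIAL CONDITIONS: Pr(x(0)|Y_0)
posterior = likelihood(0.0, grid)
posterior /= np.sum(posterior * dx)

for y in [0.3, 0.7, 1.1]:                           # synthetic measurement stream
    # PREDICTION: Pr(x(t)|Y_{t-1}) = sum_j Pr(x_i|x_j) Pr(x_j|Y_{t-1}) dx
    prediction = transition(grid[:, None], grid[None, :]) @ posterior * dx
    # UPDATE: weight the prediction by the likelihood and renormalize (evidence)
    posterior = likelihood(y, grid) * prediction
    posterior /= np.sum(posterior * dx)
```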

MONTE CARLO APPROACH

MC methods are stochastic computational algorithms capable of efficiently simulating highly complex systems. Historically motivated by games of chance and encouraged by the development of the first electronic computer, the MC approach was conceived by Ulam (1945), developed by Ulam and von Neumann (1947), and coined by Metropolis (1949). It evolved in the 1940s during the Manhattan project from scientists investigating calculations for atomic weapon designs [6]. The method has since found applications in such areas as computational physics, biology, chemistry, mathematics, engineering, materials, and finance, to name a few.

MC methods offer an alternative approach to solving classical numerical integration and optimization problems. They utilize Markov chain theory as their underlying foundation, establishing the concept that, through random sampling, the resulting empirical distribution converges to the desired posterior, called the stationary or invariant distribution of the chain. Markov chain MC techniques are based on sampling from probability distributions based on a Markov chain, which is a stochastic (state-space) system governed by a transition probability, having the desired posterior distribution as its invariant distribution. Thus, under certain assumptions, the chain converges to the desired posterior through proper random sampling as the number of samples becomes large (see [4] and [6] for details), a crucial property.

In signal processing, we are interested in some statistical measure of a random signal or parameter, usually expressed in terms of its moments. For example, suppose we have some signal function f(X) with respect to some underlying probabilistic distribution Pr(X); then a typical measure to seek is its performance on the average, which is characterized by the expectation

$$E_X\{f(X)\} = \int f(X)\, \Pr(X)\, dX. \tag{5}$$

Instead of attempting to use numerical integration techniques, stochastic sampling techniques known as MC integration have evolved. The key idea embedded in the MC approach is to represent the required distribution as a set of random samples rather than a specific analytic function (e.g., Gaussian). As the number of samples becomes large, they provide an equivalent (empirical) representation of the distribution, enabling moments to be estimated directly (inference).

MC integration draws samples from the required distribution and then forms sample averages to approximate the sought-after distributions. Thus, MC integration evaluates (5) by drawing samples {X(i)} from Pr(X), with ∼ defined as drawn from the designated distribution. Assuming perfect sampling, this produces the estimated or empirical distribution given by

$$\hat{\Pr}(X) \approx \frac{1}{N} \sum_{i=1}^{N} \delta(X - X(i)),$$

which is a probability mass distribution with weights 1/N and random variable or location X(i). Substituting the empirical distribution into the integral gives

$$E_X\{f(X)\} = \int f(X)\, \hat{\Pr}(X)\, dX \approx \frac{1}{N} \sum_{i=1}^{N} f(X(i)) \equiv \bar{f}, \tag{6}$$

which follows directly from the sifting property of the delta function. Here $\bar{f}$ is said to be an MC estimate of $E_X\{f(X)\}$.
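As a concrete illustration (not from the article), the following sketch estimates a first moment by MC integration as in (6); the distribution and function here are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(7)

# MC integration of E{f(X)} for an assumed Gaussian Pr(X) and f(X) = X^2.
# Drawing X(i) ~ Pr(X) and averaging f(X(i)) realizes the estimate in (6).
N = 100_000
X = rng.normal(loc=1.0, scale=2.0, size=N)   # X(i) ~ N(1, 4)
f_bar = np.mean(X ** 2)                      # MC estimate of E{X^2} = 1 + 4 = 5
print(f"MC estimate: {f_bar:.3f} (exact 5.0)")
```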

A generalization of the MC approach is known as importance sampling, which evolves from

$$I = \int_{\mathcal{X}} p(x)\, dx = \int_{\mathcal{X}} \left[\frac{p(x)}{q(x)}\right] q(x)\, dx \quad \text{for} \quad \int_{\mathcal{X}} q(x)\, dx = 1. \tag{7}$$

Here q(x) is referred to as the sampling distribution or, more appropriately, the importance sampling distribution, since it samples the target distribution, p(x), nonuniformly, giving more importance to some values of p(x) than others. We say that the support of q(x) covers that of p(x), i.e., the samples drawn from q(·) overlap the same region (or more) corresponding to the samples of p(·). The integral in (7) can be estimated by the following:




■ draw N samples: $X(i) \sim q(x)$ with $\hat{q}(x) \approx \frac{1}{N}\sum_{i=1}^{N} \delta(x - X(i))$
■ compute the sample mean:

$$\hat{I} = E_q\left\{\frac{p(x)}{q(x)}\right\} \approx \int \left[\frac{p(x)}{q(x)}\right] \frac{1}{N} \sum_{i=1}^{N} \delta(x - X(i))\, dx = \frac{1}{N} \sum_{i=1}^{N} \frac{p(X(i))}{q(X(i))}.$$

The art in importance sampling is in choosing the importance distribution q(·), which approximates the target distribution p(·) as closely as possible. This is the principal factor affecting performance of this approach, since variates must be drawn from q(x) that cover the target distribution.

Using the concepts of importance sampling, we can approximate the posterior distribution with a function on a finite discrete support. Since it is usually not possible to sample directly from the posterior, we use importance sampling coupled with an easy-to-sample proposal distribution, say $q(X_t|Y_t)$; this is the crucial choice and design step required in the Bayesian importance sampling methodology. Therefore, starting with a function of the set of variables, say $f(X_t)$, we would like to estimate its mean using the importance concept, i.e.,

$$E\{f(X_t)\} = \int f(X_t) \times \Pr(X_t|Y_t)\, dX_t, \tag{8}$$

where $\Pr(X_t|Y_t)$ is the posterior distribution. Using the MC approach, we would like to sample from this posterior directly and then use sample statistics to perform the estimation. Therefore, we insert the proposal importance distribution $q(X_t|Y_t)$ as before:

$$\hat{f}(t) := E\{f(X_t)\} = \int f(X_t) \left[\frac{\Pr(X_t|Y_t)}{q(X_t|Y_t)}\right] \times q(X_t|Y_t)\, dX_t. \tag{9}$$

Now we apply Bayes' rule to the posterior distribution and define a weighting function as

$$\tilde{W}(t) := \frac{\Pr(X_t|Y_t)}{q(X_t|Y_t)} = \frac{\Pr(Y_t|X_t) \times \Pr(X_t)}{\Pr(Y_t) \times q(X_t|Y_t)}. \tag{10}$$

Unfortunately, $\tilde{W}(t)$ is not useful because it requires knowledge of the evidence or normalizing constant $\Pr(Y_t)$, given by

$$\Pr(Y_t) = \int \Pr(Y_t|X_t) \times \Pr(X_t)\, dX_t, \tag{11}$$

which is usually not available. But by substituting (10) into (9) and defining a new weight, W(t), we obtain

$$\hat{f}(t) = \frac{1}{\Pr(Y_t)} \int f(X_t) \left[\frac{\Pr(Y_t|X_t)\Pr(X_t)}{q(X_t|Y_t)}\right] q(X_t|Y_t)\, dX_t = \frac{1}{\Pr(Y_t)} \int W(t)\, f(X_t)\, q(X_t|Y_t)\, dX_t, \tag{12}$$

which is simply the expectation of the weighted function $E_q\{W(t)f(X_t)\}$ scaled by the normalizing constant. From this definition of the new weighting function, we have

$$W(t)\, q(X_t|Y_t) = \Pr(Y_t|X_t)\, \Pr(X_t). \tag{13}$$

Thus, we can now replace the troublesome normalizing constant of (12) by using (13) to replace the integrand of (11), i.e.,

$$\hat{f}(t) = \frac{E_q\{W(t)f(X_t)\}}{\Pr(Y_t)} = \frac{E_q\{W(t)f(X_t)\}}{\int W(t)\, q(X_t|Y_t)\, dX_t} = \frac{E_q\{W(t)f(X_t)\}}{E_q\{W(t)\}}. \tag{14}$$

Now drawing samples from the proposal, $X_t(i) \sim q(X_t|Y_t)$, and using the MC approach leads to the desired result. That is, from the perfect sampling distribution, we have that

$$\hat{q}(X_t|Y_t) \approx \frac{1}{N} \sum_{i=1}^{N} \delta(X_t - X_t(i)), \tag{15}$$

and therefore, substituting, applying the sifting property of the Dirac delta function, and defining the normalized weights

$$\bar{W}_i(t) := \frac{W_i(t)}{\sum_{i=1}^{N} W_i(t)} \quad \text{for} \quad W_i(t) = \frac{\Pr(Y_t|X_t(i)) \times \Pr(X_t(i))}{q(X_t(i)|Y_t)}, \tag{16}$$

we obtain the final estimate

$$\hat{f}(t) \approx \sum_{i=1}^{N} \bar{W}_i(t) \times f(X_t(i)). \tag{17}$$

The importance estimator is biased, being the ratio of two sample estimators [as in (14)], but it can be shown that it asymptotically converges to the true statistic and that the central limit theorem holds [5], [6]. Thus, as the number of samples increases (N → ∞), an asymptotically optimal estimate of the posterior is

$$\hat{\Pr}(X_t|Y_t) \approx \sum_{i=1}^{N} \bar{W}_i(t) \times \delta(X_t - X_t(i)), \tag{18}$$

which is the goal of Bayesian estimation. Note that the new weight $W(t) \propto \tilde{W}(t)$, where ∝ is defined as "proportional to" up to a normalizing constant.
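To ground (16)–(17), here is a minimal self-normalized importance sampling sketch (an illustration, not the article's code); the target and proposal densities are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(11)

# Self-normalized importance sampling, eqs. (16)-(17): estimate E{f(X)} under a
# target p(x) known only up to scale, using a broader Gaussian proposal q(x).
N = 50_000
X = rng.normal(0.0, 3.0, size=N)                          # X(i) ~ q = N(0, 9)

p_unnorm = np.exp(-0.5 * (X - 1.0) ** 2)                  # unnormalized target ~ N(1, 1)
q_pdf = np.exp(-0.5 * (X / 3.0) ** 2) / (3.0 * np.sqrt(2.0 * np.pi))

W = p_unnorm / q_pdf                                      # unnormalized weights, eq. (16)
W_bar = W / W.sum()                                       # normalized weights

f_hat = np.sum(W_bar * X ** 2)                            # eq. (17) with f(X) = X^2
print(f"IS estimate of E{{X^2}}: {f_hat:.3f} (exact 2.0)")
```

Note that only the ratio of the (unnormalized) target to the proposal is needed; the normalizing constant cancels in the self-normalization, exactly as in (14).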


The importance distribution can be modified to enable a sequential estimation of the desired posterior distribution, i.e., we estimate the posterior $\hat{\Pr}(X_{t-1}|Y_{t-1})$ using importance weights W(t − 1). As a new sample becomes available, we estimate the new weights W(t), leading to an updated estimate of the posterior $\hat{\Pr}(X_t|Y_t)$. This means that to obtain the new set of samples, $X_t(i) \sim q(X_t|Y_t)$, sequentially, we must use the previous set of samples, $X_{t-1}(i) \sim q(X_{t-1}|Y_{t-1})$. Thus, with this in mind, the importance distribution $q(X_t|Y_t)$ must admit a marginal distribution $q(X_{t-1}|Y_{t-1})$, implying a Bayesian factorization

$$q(X_t|Y_t) = q(x(t), X_{t-1}|Y_t) = q(X_{t-1}|Y_{t-1}) \times q(x(t)|X_{t-1}, Y_t). \tag{19}$$

This type of importance distribution leads to the desired sequential solution [7]. Recall the Bayesian solution of (2). Recognizing the denominator as just the evidence or normalizing distribution and not a function of $X_t$, we have

$$\Pr(X_t|Y_t) \propto \Pr(y(t)|x(t)) \times \Pr(x(t)|x(t-1)) \times \Pr(X_{t-1}|Y_{t-1}). \tag{20}$$

Substituting this expression for the posterior in the weight relation, we have

$$W(t) = \frac{\Pr(X_t|Y_t)}{q(X_t|Y_t)} = \Pr(y(t)|x(t)) \times \frac{\Pr(x(t)|x(t-1))}{q(x(t)|X_{t-1}, Y_t)} \times \underbrace{\frac{\Pr(X_{t-1}|Y_{t-1})}{q(X_{t-1}|Y_{t-1})}}_{\text{previous weight}}, \tag{21}$$

which can be written simply as

$$W(t) = W(t-1) \times \frac{\Pr(y(t)|x(t)) \times \Pr(x(t)|x(t-1))}{q(x(t)|X_{t-1}, Y_t)}, \tag{22}$$

giving us the desired relationship, a sequential updating of the weight at each time step. These results then enable us to formulate a generic Bayesian sequential importance sampling algorithm as follows:
■ draw samples from the proposed importance distribution: $x_i(t) \sim q(x(t)|X_{t-1}, Y_t)$
■ determine the required conditional distributions: $\Pr(x_i(t)|x(t-1))$, $\Pr(y(t)|x_i(t))$
■ calculate the unnormalized weights: $W_i(t)$ with $x(t) = x_i(t)$
■ normalize the weights: $\bar{W}_i(t)$
■ estimate the posterior distribution: $\hat{\Pr}(X_t|Y_t) = \sum_{i=1}^{N} \bar{W}_i(t)\, \delta(x(t) - x_i(t))$.
Once the posterior is estimated, the desired statistics evolve directly.

BAYESIAN APPROACH TO THE STATE SPACE

Bayesian estimation relative to state-space models is based on extracting the unobserved or hidden dynamic internal variables (states) from noisy measurement data. The Markovian state vector with initial distribution Pr(x(0)) propagates temporally throughout the state space according to the conditional probabilistic transition distribution Pr(x(t)|x(t − 1)), while the conditionally independent measurements evolve from the likelihood distribution Pr(y(t)|x(t)). We see that the dynamic state variable at time t is obtained through the transition probability based on the previous state (Markovian property), x(t − 1), and the knowledge of the underlying conditional probability. Once propagated to time t, the dynamic state variable is used to update or correct based on the likelihood probability and the new measurement, y(t). Symbolically, this evolutionary process is

$$x(t) \sim \Pr(x(t)|x(t-1)); \qquad y(t) \sim \Pr(y(t)|x(t)).$$

The usual model-based constructs of the dynamic state variables indicate that there is an equivalence between the probabilistic distributions and the underlying state/measurement transition models. The functional discrete Markovian (state) representation is given by the models x(t) = A(x(t − 1), u(t − 1), w(t − 1)) and y(t) = C(x(t), u(t), v(t)), where w and v are the respective process and measurement noise sources, with u a known input. Here A(·) is the nonlinear (or linear) dynamic state transition function and C(·) the corresponding measurement function. Both conditional probabilistic distributions embedded within the Bayesian framework are completely specified by these functions and the underlying noise distributions Pr(w(t − 1)) and Pr(v(t)). That is, we have the equivalence

$$A(x(t-1), u(t-1), w(t-1)) \Rightarrow \Pr(x(t)|x(t-1)) \Leftrightarrow \mathcal{A}(x(t)|x(t-1))$$
$$C(x(t), u(t), v(t)) \Rightarrow \Pr(y(t)|x(t)) \Leftrightarrow \mathcal{C}(y(t)|x(t)). \tag{23}$$

Thus, the state-space model along with the noise statistics and prior distributions define the required Bayesian representation or probabilistic propagation model, defining the evolution of the states and known inputs through the transition probabilities. Here, the dynamic state variables propagate throughout the state space specified by the transition probability $\mathcal{A}(x(t)|x(t-1))$ using the embedded process model. That is, the unobserved state at time t − 1 depends on the transition probability distribution to propagate to the state at time t. Once evolved, the state combines under the corresponding measurement at time t through the conditional likelihood distribution $\mathcal{C}(y(t)|x(t))$ using the embedded measurement model to obtain the required likelihood distribution. These events continue to evolve, with the states propagating through the state transition probability using the process model and the measurements generated by the states and likelihood using the measurement model.
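As an illustration of the equivalence in (23), the following is a sketch under assumed scalar models (not the article's code): sampling the process and measurement noise realizes the transition and likelihood distributions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Assumed scalar nonlinear state-space model illustrating (23):
def A(x, w):
    return 0.9 * x + 0.2 * np.cos(x) + w     # state transition x(t) = A(x(t-1), w(t-1))

def C(x, v):
    return x ** 2 / 2.0 + v                  # measurement y(t) = C(x(t), v(t))

T, Rww, Rvv = 100, 0.1, 0.5
x = np.zeros(T)
y = np.zeros(T)
x[0] = rng.normal(0.0, 1.0)                  # x(0) ~ Pr(x(0))
y[0] = C(x[0], rng.normal(0.0, np.sqrt(Rvv)))
for t in range(1, T):
    x[t] = A(x[t - 1], rng.normal(0.0, np.sqrt(Rww)))   # x(t) ~ A(x(t)|x(t-1))
    y[t] = C(x[t], rng.normal(0.0, np.sqrt(Rvv)))       # y(t) ~ C(y(t)|x(t))
```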


Analytically, to generate the model-based version of the sequential Bayesian processor, we replace the transition and likelihood distributions with the conditionals of (23). The solution to the signal enhancement or, equivalently, state estimation problem is given by the filtering distribution Pr(x(t)|Y_t), which was solved previously (see Table 1). We start with the prediction recursion characterized by the Chapman-Kolmogorov equation [6], replacing the transition probability with the implied model-based conditional, i.e.,

$$\Pr(x(t)|Y_{t-1}) = \int \underbrace{\mathcal{A}(x(t)|x(t-1))}_{\text{embedded process model}} \times \underbrace{\Pr(x(t-1)|Y_{t-1})}_{\text{prior}}\, dx(t-1). \tag{24}$$

Next, we incorporate the model-based likelihood into the posterior equation with the understanding that the process model has been incorporated into the prediction:

$$\Pr(x(t)|Y_t) = \underbrace{\mathcal{C}(y(t)|x(t))}_{\text{embedded measurement model}} \times \underbrace{\Pr(x(t)|Y_{t-1})}_{\text{prediction}} \Big/ \Pr(y(t)|Y_{t-1}). \tag{25}$$

Thus, we see from the Bayesian perspective that the sequential Bayesian processor employing the state-space representation of (23) is straightforward.

BAYESIAN PARTICLE FILTERS

Particle filtering is a sequential MC method employing the sequential estimation of relevant probability distributions using the concepts of importance sampling and the approximations of distributions with discrete random measures [7]–[11]. The key idea is to represent the required posterior distribution by a set of $N_p$ random samples, the particles, and associated weights, $\{x_i(t), \bar{W}_i(t)\}$; $i = 1, \ldots, N_p$, and compute the required MC estimates. Of course, as the number of samples becomes very large, the MC representation becomes an equivalent characterization of the posterior distribution. Thus, particle filtering is a technique to implement sequential Bayesian processors by MC simulation. It is an alternative to approximate Kalman filtering for nonlinear problems [1], [2], [7].

In particle filtering, continuous distributions are approximated by discrete random measures composed of these weighted particles, where the particles are actually samples (realizations) of the unknown or hidden states from the Markovian state space and the weights are the associated probability masses estimated using the Bayesian recursions, as shown in Figure 1. From the figure, we see that associated with each particle $x_i(t)$ is a corresponding weight or (probability) mass $\bar{W}_i(t)$. Therefore, knowledge of this random measure $\{x_i(t), \bar{W}_i(t)\}$ characterizes an estimate of the empirical posterior distribution $\hat{\Pr}(x(t)|Y_t)$ at a particular instant of time t. Importance sampling plays a crucial role in state-space particle algorithm development. Particle filtering does not involve linearizations around current estimates but, rather, approximations of the desired distributions by these discrete measures. In comparison, the approximate Kalman filter recursively estimates the conditional mean and covariance that can be used to characterize the filtering posterior Pr(x(t)|Y_t) under Gaussian assumptions [1].

In summary, a PF is a sequential MC-based point-mass representation of probability distributions. It only requires a Markovian state-space representation of the underlying process to provide a set of particles that evolve at each time step, leading to an instantaneous approximation of the target posterior distribution of the state at time t given all of the data up to that time. Figure 1 illustrates the evolution of the posterior at a particular time step. Here we see the estimated posterior based on 22 particles (nonuniformly spaced); we select the eighth particle and weight to illustrate the instantaneous approximation at time t of $\hat{\Pr}(x(t)|Y_t)$ versus $x_i$. The full posterior over six time steps is shown in Figure 2, where each individual time slice is displayed.

[FIG1] PF representation of the posterior probability distribution in terms of weights (probabilities) and particles (samples).


[FIG2] Ensemble of instantaneous posterior probability distributions in terms of probabilities (weights), particles, and time steps ($\hat{\Pr}(x(t)|Y_t)$ versus $X_i(t)$ versus t). MAP inference is annotated on the ensemble along with the corresponding dynamic estimate, $\hat{X}_{MAP}(t)$ (left axis).

The associated maximum a posteriori (MAP) estimates are illustrated across the ensemble along with the actual dynamic estimates extracted at each time step (left-side plot). Thus, statistics are calculated across the ensemble created over time to provide the inference estimates of the states. For example, the minimum mean-squared error (MMSE) estimate is easily determined by averaging over $x_i(t)$, since

$$\hat{x}_{mmse}(t) = \int x(t)\, \hat{\Pr}(x(t)|Y_t)\, dx \approx \int x(t)\, \frac{1}{N_p} \sum_{i=1}^{N_p} \bar{W}_i(t)\, \delta(x(t) - x_i(t))\, dx = \frac{1}{N_p} \sum_{i=1}^{N_p} \bar{W}_i(t)\, x_i(t), \tag{26}$$

while the MAP estimate is simply determined by finding the sample corresponding to the maximum weight of $x_i(t)$ across the ensemble at each time step, i.e.,

$$\hat{x}_{MAP}(t) = \max_{x_i} \hat{\Pr}(x(t)|Y_t).$$

The corresponding state-space PF (SSPF) evolving from this construct follows directly after sampling from the importance distribution, i.e.,

$$x_i(t) \sim q(x(t)|x(t-1), y(t))$$
$$W_i(t) = W_i(t-1) \times \frac{\mathcal{C}(y(t)|x_i(t)) \times \mathcal{A}(x_i(t)|x_i(t-1))}{q(x_i(t)|x_i(t-1), y(t))}$$
$$\bar{W}_i(t) = \frac{W_i(t)}{\sum_{i=1}^{N_p} W_i(t)}, \tag{27}$$

and the filtering posterior is estimated by

$$\hat{\Pr}(x(t)|Y_t) \approx \sum_{i=1}^{N_p} \bar{W}_i(t) \times \delta(x(t) - x_i(t)). \tag{28}$$

BOOTSTRAP PARTICLE FILTER

The basic design tool when developing particle filtering algorithms is the choice of the importance sampling distribution, q(·). One of the most popular realizations of this approach was developed by using the transition prior as the importance proposal [3]. This prior is defined in terms of the state-space representation by $\mathcal{A}(x(t)|x(t-1)) \rightarrow A(x(t-1), u(t-1), w(t-1))$, which is dependent on the known excitation and process noise statistics. It is given by

$$q_{prior}(x(t)|x(t-1), Y_t) \longrightarrow \Pr(x(t)|x(t-1)) \longrightarrow \mathcal{A}(x(t)|x(t-1)).$$

Substituting this choice into the expression for the weights gives

$$W_i(t) = W_i(t-1) \times \frac{\mathcal{C}(y(t)|x_i(t)) \times \mathcal{A}(x(t)|x_i(t-1))}{q_{prior}(x(t)|x_i(t-1), Y_t)} = W_i(t-1) \times \mathcal{C}(y(t)|x_i(t)).$$

Note two properties resulting from this choice of importance distribution. First, the proposed importance distribution does not use the most recent observation, y(t); second, this choice is easily implemented and updated by simply evaluating the measurement likelihood $\mathcal{C}(y(t)|x_i(t))$; $i = 1, \ldots, N_p$, for the sampled particle set. These weights require the particles to be propagated to time t before the weights can be calculated. This choice can lead to problems, however, since the transition prior is not conditioned on the measurement data, especially the most recent, y(t). Failing to incorporate the latest available information from the most recent measurement to propose new values for the states leads to only a few particles having significant weights when their likelihood is calculated. The transition prior is a much broader distribution than the likelihood, indicating that only a few particles will be assigned a large weight. The algorithm will degenerate rapidly. Thus, the SSPF algorithm takes the same generic form as before, with the importance weights much simpler to evaluate with this approach. It has been called the bootstrap PF, the condensation PF, or the survival-of-the-fittest algorithm [3], [12].

As mentioned, one of the major problems with importance sampling algorithms is the depletion of the particles, i.e., the weights tend to increase in variance at each iteration. The degeneracy of the particle weights creates a problem that must be resolved before these particle algorithms can be of any pragmatic use in applications. The problem occurs because the variance of the importance weights can only increase in time [3], thereby making it impossible to avoid this weight degradation. Degeneracy implies that a large computational effort is devoted to updating particles whose contribution to the posterior is negligible. Thus, there is a need to somehow resolve this problem to make the simulation-based techniques viable. This requirement leads to the idea of resampling the particles.

The main objective in simulation-based sampling techniques is to generate independent identically distributed (i.i.d.) samples from the targeted posterior distribution to perform statistical inferences extracting the desired information. Thus, the importance weights are quite critical, since they contain probabilistic information about each specific particle. In fact, they provide us with information about how probable it is that a sample has been drawn from the target posterior [3], [4].


Therefore, the weights can be considered acceptance probabilities enabling us to generate independent (approximately) samples from the posterior Pr(x(t)|Y_t). The empirical distribution $\hat{\Pr}(x(t)|Y_t)$ is defined over a set of finite ($N_p$) random measures $\{x_i(t), W_i(t)\}$; $i = 1, \ldots, N_p$, approximating the posterior. Resampling, therefore, can be thought of as a realization of enhanced particles $\hat{x}_k(t)$ extracted from the original samples $x_i(t)$ based on their acceptance probability $W_i(t)$ at time t, i.e., statistically, we have

$$\Pr(\hat{x}_k(t) = x_i(t)) = W_i(t) \quad \text{for } i = 1, \ldots, N_p, \tag{29}$$

or symbolically, $\hat{x}_k(t) \Rightarrow x_i(t)$, with the set of new particles $\{\hat{x}_k(t)\}$ replacing the old set $\{x_i(t)\}$, where ⇒ is defined as resampled from.

The fundamental concept in resampling theory is to preserve particles with large weights (large probabilities) while discarding those with small weights. Two steps must occur to resample effectively: 1) a decision, on a weight-by-weight basis, must be made to select the appropriate weights and reject the inappropriate, and 2) resampling must be performed to minimize the degeneracy. The overall strategy, when coupled with importance sampling, is termed SIR [3], [4]. A reasonable measure of degeneracy is the effective particle sample size [6] defined by

$$N_{eff}(t) := \frac{N_p}{E_q\{W^2(X_t)\}} \approx \frac{1}{\sum_{i=1}^{N_p} \bar{W}_i^2(t)} \leq N_p, \tag{30}$$

and the decision is made by comparing it to a threshold, $N_{thresh}$, i.e., we use the rejection method [6] to decide whether or not to resample:

$$\hat{N}_{eff}(t) = \begin{cases} \text{Resample}, & \leq N_{thresh} \\ \text{Accept}, & > N_{thresh}. \end{cases}$$

Once the decision is made to resample, a uniform sampling procedure can be applied, removing samples with low importance weights and multiplying samples with high importance weights, generating a set of new samples so that $\Pr(\hat{x}_i(t) = x_j(t)) = \bar{W}_j(t)$. The resulting i.i.d. samples are uniform such that the sampling induces the mapping $\{x_i(t), W_i(t)\} \rightarrow \{\hat{x}_i(t), \hat{W}_i(t)\}$, $\hat{W}_i(t) = 1/N_p\ \forall i$. Resampling decreases the degeneracy problem algorithmically but introduces its own set of problems. Theoretically, after one resampling step, the simulated trajectories are no longer statistically independent; therefore, the simple convergence results obtained under these assumptions lose their validity [3]. We summarize the bootstrap PF algorithm in Table 2 and illustrate its structure in Figure 3; a code-level sketch follows the table.

[TABLE 2] BOOTSTRAP SIR STATE-SPACE PARTICLE FILTERING ALGORITHM.

INITIALIZE: $x_i(0) \sim \Pr(x(0))$, $W_i(0) = 1/N_p$, $i = 1, \ldots, N_p$ [Sample]

IMPORTANCE SAMPLING: $x_i(t) \sim \mathcal{A}(x(t)|x_i(t-1))$; $w_i \sim \Pr(w_i(t))$ [State Transition]

WEIGHT UPDATE: $W_i(t) = W_i(t-1) \times \mathcal{C}(y(t)|x_i(t))$ [Weights]

WEIGHT NORMALIZATION: $\bar{W}_i(t) = W_i(t) / \sum_{i=1}^{N_p} W_i(t)$ [Normalize]

RESAMPLING: if $\hat{N}_{eff}(t) \leq N_{thresh}$, then $\hat{x}_i(t) \Rightarrow x_j(t)$ [Resample]

[FIG3] State-space SIR (bootstrap) particle filtering algorithm structure: initialization, prediction (state transition), updating (likelihood), resampling.
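The following is a minimal sketch of one bootstrap SIR step per Table 2 (an illustration with assumed Gaussian models, not the article's implementation), including the effective-sample-size test of (30) and the multinomial resampling rule of (29).

```python
import numpy as np

rng = np.random.default_rng(5)

def bootstrap_step(particles, weights, y, Rww=0.1, Rvv=0.5, n_thresh=None):
    """One bootstrap SIR step (Table 2) for an assumed scalar random-walk state
    model with a linear Gaussian measurement y(t) = x(t) + v(t)."""
    Np = particles.size
    n_thresh = Np if n_thresh is None else n_thresh

    # IMPORTANCE SAMPLING: propagate through the transition prior A(x(t)|x(t-1))
    particles = particles + rng.normal(0.0, np.sqrt(Rww), Np)

    # WEIGHT UPDATE: multiply by the likelihood C(y(t)|x_i(t))
    weights = weights * np.exp(-0.5 * (y - particles) ** 2 / Rvv)

    # WEIGHT NORMALIZATION
    weights /= weights.sum()

    # RESAMPLING: multinomial draw per (29) when N_eff falls below threshold
    n_eff = 1.0 / np.sum(weights ** 2)
    if n_eff <= n_thresh:
        idx = rng.choice(Np, size=Np, p=weights)
        particles, weights = particles[idx], np.full(Np, 1.0 / Np)
    return particles, weights

# INITIALIZE and run over a short synthetic measurement stream
Np = 1000
particles = rng.normal(0.0, 1.0, Np)          # x_i(0) ~ Pr(x(0))
weights = np.full(Np, 1.0 / Np)
for y in [0.2, 0.5, 0.4, 0.9]:
    particles, weights = bootstrap_step(particles, weights, y)
print("MMSE estimate:", np.sum(weights * particles))
```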

SYNTHETIC APERTURE PROCESSING FOR A TOWED ARRAY

Synthetic aperture processing is well known in airborne radar but not as familiar in sonar [16]–[21]. The underlying idea in creating a synthetic aperture is to increase the array length by motion, thereby increasing spatial resolution (bearing) and gain in signal-to-noise ratio (SNR). It has been shown that for stationary targets, the motion-induced bearing estimates have smaller variance than those of a stationary array [2], [19]. Here we extend this to the case of both array and target motion, resulting in a further increase in SNR.

We define the acoustic array space-time processing problem as: Given a set of noisy pressure-field measurements from a horizontally towed array of L sensors in motion, find the best (minimum error variance) estimate of the target bearing.

We use the following nonlinear pressure-field measurement model for M monochromatic plane-wave targets characterized by a corresponding set of temporal frequencies, bearings, and amplitudes, $[\{\omega_m\}, \{\theta_m\}, \{a_m\}]$, i.e.,

$$p(x, t_k) = \sum_{m=1}^{M} a_m\, e^{j\omega_m t_k - j\beta(x, t_k)\sin\theta_m} + n(t_k), \tag{31}$$

where $\beta(x, t_k) := k_o[x(t_o) + v t_k]$, $k_o = 2\pi/\lambda_o$ is the wavenumber, $x(t_k)$ is the current spatial position along the x-axis in meters, v is the tow speed in m/s, and $n(t_k)$ is the additive random noise. The inclusion of motion in the generalized wavenumber $\beta$ is critical to the improvement of the processing, since the synthetic aperture effect is actually created through the motion itself and not simply the displacement. If we further assume that the single-sensor equation above is expanded to include an array of L sensors, $x \rightarrow x_\ell$, $\ell = 1, \ldots, L$, then we obtain

$$p(x_\ell, t_k) = \sum_{m=1}^{M} a_m\, e^{j\omega_m t_k - j\beta(x_\ell, t_k)\sin\theta_m} + n_\ell(t_k) \longrightarrow \sum_{m=1}^{M} a_m \cos(\omega_m t_k - \beta(x_\ell, t_k)\sin\theta_m) + n_\ell(t_k). \tag{32}$$

Since our hydrophone sensors measure the real part of the complex pressure field, the final nonlinear measurement model of the system can be written in compact vector form as

$$\mathbf{p}(t_k) = \mathbf{c}[t_k; \Theta] + \mathbf{n}(t_k), \tag{33}$$

where $\mathbf{p}, \mathbf{c}, \mathbf{n} \in C^{L\times 1}$ are the respective pressure-field, measurement, and noise vectors, $\Theta \in R^{M\times 1}$ represents the target bearings, and the corresponding vector measurement model is

$$c_\ell(t_k; \Theta) = \sum_{m=1}^{M} a_m \cos(\omega_m t_k - \beta(x_\ell, t_k)\sin\theta_m) \quad \text{for } \ell = 1, \ldots, L.$$

Since we model the bearings as a random walk (constants ($\dot{\Theta} = 0$) plus noise) emulating random target motion, an augmented Markovian state-space model evolves from first differences as

$$\Theta(t_k) = \Theta(t_{k-1}) + w_\theta(t_{k-1}). \tag{34}$$

For this problem, we assume the additive noise sources are Gaussian, so we can compare results to the performance of the approximate processor. We will return to our original discrete notation $t_{k+1} \rightarrow t + 1$ for the sampled-data representation.
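As an illustration of (31)–(34), the following sketch synthesizes hydrophone data for a moving array with random-walk bearings; all parameter values here (noise variances, pitch, frequencies) are assumptions for the example, not the article's simulation code.

```python
import numpy as np

rng = np.random.default_rng(9)

# Synthesize towed-array pressure-field data per (31)-(34); illustrative values.
L, M = 4, 2                              # sensors, targets
lam, v, dT = 30.0, 5.0, 0.005            # wavelength (m), tow speed (m/s), sample interval (s)
k0 = 2 * np.pi / lam                     # wavenumber
omega = 2 * np.pi * 50.0 * np.ones(M)    # temporal frequencies (rad/s)
a = np.ones(M)                           # unity amplitudes
x0 = 15.0 * np.arange(L)                 # sensor positions at t_o (assumed 15-m pitch)
Rww, Rvv = 2.5e-4, 0.1414                # assumed walk variance (rad^2) and noise variance

N = 500
theta = np.zeros((N, M))
theta[0] = np.deg2rad([45.0, -10.0])     # initial bearings
p = np.zeros((N, L))
for k in range(N):
    tk = k * dT
    if k > 0:                            # random-walk bearings, eq. (34)
        theta[k] = theta[k - 1] + rng.normal(0.0, np.sqrt(Rww), M)
    beta = k0 * (x0 + v * tk)            # generalized wavenumber beta(x_l, t_k)
    for l in range(L):                   # real pressure field, eq. (32)
        p[k, l] = np.sum(a * np.cos(omega * tk - beta[l] * np.sin(theta[k])))
    p[k] += rng.normal(0.0, np.sqrt(Rvv), L)   # additive measurement noise
```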

BAYESIAN SEQUENTIAL PROCESSOR DESIGN

Let us cast this problem into the sequential Bayesian framework, i.e., we would like to estimate the instantaneous posterior filtering distribution $\hat{\Pr}(x(t)|Y_t)$ using the PF representation to be able to perform inferences and extract the target bearing estimates. Therefore, for the Bayesian processor and our problem, we have that the state transition probability is given by ($\Theta(t) \rightarrow x(t)$)

$$\Pr(x(t)|x(t-1)) \longrightarrow \mathcal{A}(x(t)|x(t-1)) \sim \mathcal{N}(\Theta(t) : a[\Theta(t-1)], R_{w_\theta w_\theta}),$$

or, in terms of our state transition (bearings) model, we have

$$\Theta(t) = a[\Theta(t-1)] + w_\theta(t-1) = \Theta(t-1) + w_\theta(t-1) \quad \text{for } \Pr(w_\theta(t)) \sim \mathcal{N}(0, R_{w_\theta w_\theta}).$$

Thus, the state-space model is linear with no explicit dynamics; therefore, the process matrix A = I (identity) and the relations are greatly simplified. Now let us see how a PF using the bootstrap approach would be constructed according to the generic algorithm of Table 2. The corresponding likelihood is specified in terms of the measurement model ($y(t) \rightarrow p(t)$) as

$$\Pr(y(t)|x(t)) \longrightarrow \mathcal{C}(y(t)|x(t)) \sim \mathcal{N}(y(t) : c[\Theta(t)], R_{vv}(t)),$$

where we have used the notation $z \sim \mathcal{N}(z : m_z, R_{zz})$ to specify the Gaussian distribution in random vector z.

[FIG4] Synthetic aperture sonar tracking problem: simulated target motion from initial bearings of 45° and −10° and array measurements (−10 dB SNR).

[FIG5] PF bearing estimates: 45° and −10° PF bearing (state) estimates and simulated target tracks (data, MAP, EKF).

In terms of our problem, we have that

$$\ln \mathcal{C}(y(t)|x(t)) = \kappa - \frac{1}{2}\left(y(t) - \sum_{m=1}^{M} a_m \cos(\omega_m t - \beta(t)\sin\theta_m)\right)' R_{vv}^{-1} \left(y(t) - \sum_{m=1}^{M} a_m \cos(\omega_m t - \beta(t)\sin\theta_m)\right),$$

with $\kappa$ a constant, $\beta \in R^{L\times 1}$, and $\beta(t) := [\beta(x_1, t)| \cdots |\beta(x_L, t)]'$, the dynamic wavenumber expanded over the array. Thus, the SIR algorithm becomes:
■ draw samples (particles) from the state transition distribution: $\Theta_i(t) \sim \mathcal{N}(\Theta(t) : a[\Theta(t-1)], R_{w_\theta w_\theta})$ with $w_{\theta_i}(t) \sim \Pr(w(t)) \sim \mathcal{N}(0, R_{w_{\theta_i} w_{\theta_i}})$, i.e., $\Theta_i(t) = \Theta_i(t-1) + w_{\theta_i}(t-1)$
■ estimate the likelihood, $\mathcal{C}(y(t)|\Theta(t)) \sim \mathcal{N}(y(t) : c[\Theta(t)], R_{vv}(t))$, with $c_\ell(t; \Theta_i) = \sum_{m=1}^{M} a_m \cos(\omega_m t_k - \beta(x_\ell, t)\sin\theta_{m,i}(t))$ for $\ell = 1, \ldots, L$, where $\theta_{m,i}$ is the ith particle at the mth bearing angle
■ update and normalize the weights: $\bar{W}_i(t) = W_i(t)/\sum_{i=1}^{N_p} W_i(t)$
■ resample: $\hat{N}_{eff}(t) \leq N_{thresh}$
■ estimate the instantaneous posterior: $\hat{\Pr}(\Theta(t)|Y_t) \approx \sum_{i=1}^{N_p} \bar{W}_i(t)\, \delta(\Theta(t) - \Theta_i(t))$
■ perform the inference by estimating the corresponding statistics: $\hat{\Theta}_{MAP}(t) = \arg\max \hat{\Pr}(\Theta(t)|Y_t)$; $\hat{\Theta}_{MMSE}(t) = E\{\Theta(t)|Y_t\} = \sum_{i=1}^{N_p} \Theta_i(t)\, \hat{\Pr}(\Theta(t)|Y_t)$; $\hat{\Theta}_{MEDIAN}(t) = \text{median}(\hat{\Pr}(\Theta(t)|Y_t))$.
A minimal code-level sketch of this loop appears below.
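The following sketch (illustrative only; parameter values and helper names are assumptions) wires these steps together for the bearing-tracking problem, reusing the synthesized data array p and the model constants from the earlier synthesis sketch.

```python
import numpy as np

rng = np.random.default_rng(13)

# Bootstrap PF for the towed-array bearing-tracking sketch (assumed settings).
L, M, Np = 4, 2, 1000
lam, v, dT = 30.0, 5.0, 0.005
k0 = 2 * np.pi / lam
omega = 2 * np.pi * 50.0 * np.ones(M)
a = np.ones(M)
x0 = 15.0 * np.arange(L)
Rww, Rvv = 2.5e-4, 0.1414

def c_model(theta, tk):
    """Vector measurement model c(t; Theta) of (33) evaluated for all particles."""
    beta = k0 * (x0 + v * tk)                                   # (L,)
    phase = omega * tk - beta[:, None, None] * np.sin(theta)    # (L, Np, M)
    return np.sum(a * np.cos(phase), axis=-1)                   # (L, Np)

def pf_step(theta, W, y, tk):
    theta = theta + rng.normal(0.0, np.sqrt(Rww), theta.shape)  # transition prior draw
    resid = y[:, None] - c_model(theta, tk)                     # per-particle innovations
    W = W * np.exp(-0.5 * np.sum(resid ** 2, axis=0) / Rvv)     # Gaussian likelihood weight
    W /= W.sum()
    if 1.0 / np.sum(W ** 2) <= Np:                              # here: resample every step
        idx = rng.choice(Np, size=Np, p=W)
        theta, W = theta[idx], np.full(Np, 1.0 / Np)
    return theta, W

theta = np.deg2rad([45.0, -10.0]) + rng.normal(0.0, 1e-5, (Np, M))  # initialize particles
W = np.full(Np, 1.0 / Np)
for k, y in enumerate(p):      # p: the (N, L) hydrophone data from the synthesis sketch
    theta, W = pf_step(theta, W, y, k * dT)
    theta_mmse = W @ theta                     # MMSE bearing estimate
    theta_map = theta[np.argmax(W)]            # MAP bearing estimate
```

Setting the threshold to N_p (as here) forces resampling at every step; lowering it trades accuracy for fewer resampling operations, the tradeoff examined in the simulation discussion below.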

NARROWBAND SYNTHETIC APERTURE SIMULATION

Consider the following simulation of the synthetic aperture using a four-element, linear towed array and two moving targets with the following parameters:
■ Targets: unity amplitudes with temporal frequency 50 Hz, wavelength 30 m, speed 5 m/s
■ Array: four-element linear towed array with 15-m pitch
■ PF: $N_\theta = 2$ states (bearings), $N_y = 4$ sensors, N = 500 samples, $N_p = 1{,}000$ weights
■ SNR: −10 dB
■ Noise: white Gaussian with $R_{ww} = \text{diag}(2.5)$, $R_{vv} = \text{diag}(0.1414)$; sampling interval 0.005 s
■ Initial conditions: (bearings and uncertainty) $\Theta_o = [45°\ -10°]'$, $P_o = \text{diag}(10^{-10})$.

The array simulation was executed as the targets moved according to a random walk specified by the process noise, with sensor array measurements at −10 dB SNR. The results are shown in Figure 4: in Figure 4(a) we see the noisy synthesized bearings, and in Figure 4(b) the four noisy sensor measurements at the moving array. The bearing (state) estimates are shown in Figure 5, where we observe the targets making a variety of course alterations. The PF is able to track the target motions quite well, while we observe the extended Kalman filter


[FIG6] PF measurement estimates: four-channel hydrophone estimates (data, MAP, EKF).

(EKF) [1] unable to respond quickly enough, finally losing track completely. Both the MAP and MMSE estimates using the estimated posterior provide excellent tracking. Note that these bearing inputs would provide the raw data for an XY-tracker [1]. The PF estimated or filtered measurements are shown in Figure 6. As expected, the PF tracks the measurement data quite well, while the EKF is again in error. Using the usual optimality tests for performance demonstrates that the PF processor works well, since each measurement channel is zero-mean and white, with the weighted sum square residual (WSSR) statistic lying below the threshold. This indicates a white innovations sequence, demonstrating the tracking ability of the PF processor, at least in a classical sense [1]. The instantaneous posterior distributions for both bearing estimates are shown in Figure 7. Here we see the Gaussian nature of the bearing estimates generated by the random walk. Clearly, the PF performs quite well for this problem.

It is also interesting to note how the performance of this processor changes with the number of particles selected as well as the number of resampling steps required. These steps are controlled by the threshold ($N_{thresh}$), i.e., if $N_{eff} < N_{thresh}$, resampling occurs; otherwise, it does not. The performance is quantified by a measure of accuracy (RMS error) and computational cost (number of resampling steps per run). We selected the number of particles as $N_p$ = 1,000, 500, and 250 with corresponding threshold $N_{thresh} = N_p$. As expected, in all three cases ($N_p$ = 1,000, 500, 250), the minimum root mean-squared error (RMS) occurred by resampling at every time step ($N_{thresh} = N_p$), with RMS state-error pairs for each respective $N_p$ given by ($RMS_1$, $RMS_2$) = (8%, 6%); (8.2%, 6.1%); (8.3%, 6.2%), indicating that the RMS state error increases when using fewer and fewer particles while resampling at every time step.

Next, using $N_p$ = 1,000, we varied $N_{thresh}$ in 100-particle increments from 500 down to 100 particles, demanding fewer resampling steps. The resulting RMS errors for each selected threshold were, respectively, ($N_{thresh}$ = 500, 400, 300, 200, 100) → ($RMS_1$, $RMS_2$) = (8%, 6%); (9%, 6.5%); (10%, 7.8%); (11%, 8.2%); (14.6%, 14.6%), with corresponding resampling steps of ($N_{resample}$ = 500, 370, 187, 119, 67), clearly demonstrating the tradeoff between accuracy and resampling. So we see from this analysis that as the number of resampling steps is diminished (threshold setting), the accuracy decreases but so does the computational cost, as expected. In all cases except $N_{thresh}$ = 100, the innovations sequence produced by the PF remained white and zero-mean, classical properties indicating successful performance [1]. Finally, for the 100-particle threshold, the performance deteriorated, as indicated by the nonwhite innovations and the increased RMS errors.


[FIG7] PF instantaneous posterior bearing estimates (updated posterior distributions versus time and sample number): 45° and −10° targets.

SUMMARY

In this article, we have provided an overview of nonlinear statistical signal processing based on the Bayesian paradigm. We showed that the next-generation processors are well founded on MC simulation-based sampling techniques. We reviewed the development of the sequential Bayesian processor using state-space models. The popular bootstrap algorithm was outlined and applied to an ocean acoustic synthetic aperture towed-array target tracking problem to test the performance of the particle filtering technique.

ACKNOWLEDGMENTS

This work was performed under the auspices of the U.S. Department of Energy by the Lawrence Livermore National Laboratory under contract W-7405-ENG-48 while Dr. Candy was on sabbatical leave at the University of Cambridge, U.K., collaborating with Prof. Simon Godsill of the Signal Processing Group. The author would also like to thank the reviewers for their helpful and constructive comments, which significantly improved this manuscript.

AUTHOR

James V. Candy ([email protected]) is the chief scientist for engineering at Lawrence Livermore National Laboratory and an adjunct professor at the University of California, Santa Barbara. He is a Fellow of the IEEE and the Acoustical Society of America and a Life Member (Fellow) at the University of Cambridge (Clare Hall College). He received the IEEE Distinguished Technical Achievement Award for the development of model-based signal processing in ocean acoustics and is an IEEE Distinguished Lecturer for oceanic processing. He authored Signal Processing: The Model-Based Approach and Signal Processing: The Modern Approach (McGraw-Hill, 1986 and 1988, respectively) and Model-Based Signal Processing (Wiley/IEEE Press, 2006).

REFERENCES

[1] J.V. Candy, Model-Based Signal Processing. Hoboken, NJ: Wiley, 2006.
[2] E.J. Sullivan and J.V. Candy, "Space-time array processing: The model-based approach," J. Acoust. Soc. Amer., vol. 102, no. 5, pp. 2809–2820, 1997.
[3] A. Doucet, N. de Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice. New York: Springer-Verlag, 2001.
[4] O. Cappe, E. Moulines, and T. Ryden, Inference in Hidden Markov Models. New York: Springer-Verlag, 2005.
[5] M. West and J. Harrison, Bayesian Forecasting and Dynamic Models, 2nd ed. New York: Springer-Verlag, 1995.
[6] J. Liu, Monte Carlo Strategies in Scientific Computing. New York: Springer-Verlag, 2001.
[7] B. Ristic, S. Arulampalam, and N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications. Norwood, MA: Artech House, 2004.
[8] S. Godsill and P. Djuric, "Monte Carlo methods for statistical signal processing," IEEE Trans. Signal Processing (Special Issue), vol. 50, no. 2, pp. 173–499, 2002.
[9] P. Djuric, J. Kotecha, J. Zhang, Y. Huang, T. Ghirmai, M. Bugallo, and J. Miguez, "Particle filtering," IEEE Signal Processing Mag., vol. 20, no. 5, pp. 19–38, 2003.
[10] S. Haykin and N. de Freitas, "From Kalman filters to particle filters," Proc. IEEE (Special Issue on Sequential State Estimation), vol. 92, no. 3, pp. 399–574, 2004.
[11] A. Doucet and X. Wang, "Monte Carlo methods for signal processing," IEEE Signal Processing Mag., vol. 24, no. 5, pp. 152–170, 2005.
[12] N. Gordon, D. Salmond, and A.F.M. Smith, "Novel approach to nonlinear/non-Gaussian Bayesian state estimation," Proc. Inst. Elect. Eng., vol. 140, no. 2, pt. F, pp. 107–113, 1993.
[13] G. Kitagawa and W. Gersch, Smoothness Priors Analysis of Time Series. New York: Springer-Verlag, 1996.
[14] M. Isard and A. Blake, "Condensation–conditional density propagation for visual tracking," Int. J. Comput. Vis., vol. 29, no. 1, pp. 5–28, 1998.
[15] M.S. Arulampalam, S. Maskell, N. Gordon, and T. Clapp, "A tutorial on particle filters for nonlinear/non-Gaussian Bayesian tracking," IEEE Trans. Signal Processing, vol. 50, no. 2, pp. 174–188, 2002.
[16] R.E. Williams, "Creating an acoustic synthetic aperture in the ocean," J. Acoust. Soc. Amer., vol. 60, no. 1, pp. 60–73, 1976.
[17] N. Yen and W.M. Carey, "Applications of synthetic aperture processing to towed array data," J. Acoust. Soc. Amer., vol. 86, no. 2, pp. 754–765, 1989.
[18] S. Stergiopoulus and E.J. Sullivan, "Extended towed array processing by an overlap correlator," J. Acoust. Soc. Amer., vol. 86, no. 1, pp. 158–171, 1989.
[19] E.J. Sullivan, W.M. Carey, and S. Stergiopoulus, "Editorial in special issue on acoustic synthetic aperture processing," IEEE J. Ocean. Eng., vol. 17, no. 1, pp. 1–7, 1992.
[20] D.B. Ward, E.A. Lehmann, and R.C. Willianson, "Particle filtering algorithm for tracking an acoustic source in a reverberant environment," IEEE Trans. Speech Audio Processing, vol. 11, no. 6, pp. 826–836, 2003.
[21] M. Orton and W.C. Fitzgerald, "Bayesian approach to tracking multiple targets using sensor arrays and particle filters," IEEE Trans. Signal Processing, vol. 50, no. 2, pp. 216–223, 2002. [SP]
