

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 12, NO. 8, AUGUST 2013

Sparsity-Inducing Direction Finding for Narrowband and Wideband Signals Based on Array Covariance Vectors

Zhang-Meng Liu, Member, IEEE, Zhi-Tao Huang, and Yi-Yu Zhou

Abstract—Among the existing sparsity-inducing direction-of-arrival (DOA) estimation methods, the sparse Bayesian learning (SBL) based ones have been demonstrated to achieve enhanced precision. However, the learning process of those methods converges much more slowly when the signal-to-noise ratio (SNR) is relatively low. In this paper, we first show that the covariance vectors (columns of the covariance matrix) of the array output of independent signals share identical sparsity profiles corresponding to the spatial signal distribution, and that their SNR exceeds that of the raw array output when moderately many snapshots are collected. Thus the SBL technique can be used to estimate the directions of independent narrowband/wideband signals by reconstructing those vectors with high computational efficiency. The method is then extended to narrowband correlated signals after proper modifications. In-depth analyses are also provided to show the lower bound of the new method in DOA estimation precision and the maximal number of signals it can separate in the case of independent signals. Simulation results finally demonstrate the performance of the proposed method in both DOA estimation precision and computational efficiency.

Index Terms—Direction-of-arrival (DOA) estimation, sparse reconstruction, relevance vector machine (RVM), covariance vector.

Manuscript received August 29, 2012; revised December 18, 2012 and February 27, 2013; accepted April 15, 2013. The associate editor coordinating the review of this paper and approving it for publication was A. Zajic. This work was supported in part by the National Natural Science Foundation (No. 61072120). The authors are with the School of Electronic Science and Engineering, National University of Defense Technology, Changsha, 410073, China (email: zm [email protected]; [email protected]). Digital Object Identifier 10.1109/TWC.2013.071113.121305

I. INTRODUCTION

The array output covariance matrix contains the directional information of the incident signals and concentrates the signal energy distributed over all the snapshots; it is therefore widely exploited in direction-of-arrival (DOA) estimation methods for both narrowband [1] and wideband [2], [3] signals. In particular applications, the calculation of the covariance matrix may even greatly facilitate the measurement formulation of wideband signals [4]. However, most of the existing covariance-matrix-based DOA estimation methods require prior knowledge of the number of incident signals [1]-[3], and the spectral decomposition and focusing procedures applied to wideband array outputs may introduce extra imperfections into the measurements [2], [3].

Sparse reconstruction techniques have attracted much interest recently in various areas including wireless communications, with special attention in this field paid to channel estimation [5]-[7], data gathering [8], interference estimation [9], multiuser detection [10], etc. Another important application of those techniques lies in DOA estimation with antenna arrays [4], [11]-[14]. The sparsity-inducing DOA estimation methods make use of the spatial sparsity of the incident signals, and they estimate the signal directions by reconstructing the array outputs on a directional overcomplete dictionary. Previous simulation results have demonstrated the superiority of those methods in superresolution and robustness, especially in demanding scenarios with low signal-to-noise ratio (SNR), limited snapshots and spatially adjacent signals [4], [11]-[14].

The existing sparsity-inducing DOA estimation methods mainly fall into two categories, the $\ell_p$-norm ($0 \le p \le 1$) based ones [4], [11]-[13] and the sparse Bayesian learning (SBL) based ones [14]. It has been demonstrated both theoretically and empirically that the SBL technique [15] induces less structural error (biased global minimum) and convergence error (failure in achieving the global minimum) than the $\ell_p$-norm ($0 \le p \le 1$) based ones [16], [17], and that it also performs well in capturing local signal properties to facilitate the refined DOA estimation process [14]. Several other works have also introduced the Bayesian idea to the field of array signal processing [18]-[20]. Nonetheless, the beamformers generally do not perform satisfactorily in superresolution [18], [19], and further research is required to apply the periodic cost function [20] in scenarios of multiple signals with unknown waveforms. The work most relevant to the method proposed here is the relevance vector machine (RVM)-based DOA estimator [14]. That estimator has been shown to surpass its subspace-based and $\ell_p$-norm-based counterparts in adaptation to demanding scenarios and in DOA estimation precision.
However, the simulation results in [14] also indicate a significant drawback of the RVM-DOA method, i.e., its computational efficiency deteriorates rapidly as the SNR decreases, due to the slower convergence of the reconstruction process. In this paper, we first calculate the covariance matrix of the array output, as is usually done in the subspace-based DOA estimators, and then realize DOA estimation of independent narrowband/wideband signals and correlated narrowband signals by reconstructing the covariance vectors (columns of the covariance matrix) sparsely in the spatial domain, rather than by exploiting the orthogonality between the signal and noise subspaces. The method to be proposed is also RVM-based

© 2013 IEEE 1536-1276/13$31.00


as the RVM-DOA method [14], and we name it CV-RVM, with CV denoting Covariance Vectors. The major motivation for switching from the strategy in [14] to this one is that the SNR of those vectors is higher than that of the raw array outputs when moderately many snapshots are collected, while the vectors still share the same sparsity profiles as the outputs. The enhanced SNR is expected to make up for the drawback of the RVM-DOA method in computational efficiency. Moreover, due to the structural differences between the covariance vectors and the raw array outputs, the CV-based DOA estimator cannot be derived straightforwardly from the work in [14], and special attention should be paid to the implementation and properties of CV-RVM. The idea of estimating narrowband and wideband signal directions by reconstructing the covariance vectors can also be found in [12] and [4]. However, those methods can hardly obtain DOA estimates with satisfactory precision due to the shortcomings of the $\ell_p$-norm reconstruction techniques they use, as we will show in the simulations of this paper.

The rest of the paper consists of six parts. Section II reviews the formulation of the array covariance vectors. In Section III, we analyze the first- and second-order statistics of the covariance vector estimation error in the case of limited snapshots, and show that the SNR of the vectors exceeds that of the raw array outputs if moderately many snapshots are collected. In Section IV, we introduce the SBL technique to estimate the directions of independent narrowband and wideband signals by reconstructing those vectors, and then extend the proposed method to correlated narrowband signals. The theoretical lower bound of the proposed method in DOA estimation precision and the maximal number of sources that it is able to separate in the case of independent signals are analyzed in Section V.
Numerical examples are carried out in Section VI to demonstrate the performance of the proposed method. Section VII concludes the whole paper.

II. MODEL FORMULATION

Suppose that $K$ stochastic Gaussian signals impinge simultaneously onto an $M$-element array from directions $\vartheta = [\vartheta_1, \cdots, \vartheta_K]$. The output of the $m$th sensor at time $t$ is

$$x_m(t) = \sum_{k=1}^{K} s_k(t + \tau_{k,m}) + v_m(t), \tag{1}$$

where $s_k(t)$ is the $k$th signal waveform, $\tau_{k,m}$ is the propagation delay of this signal between the $m$th sensor and the reference, and $v_m(t)$ is independent white Gaussian noise with variance $\sigma^2$. Suppose that the array sensors are located in the same 2-D plane; then $\tau_{k,m} = d_m^T g_k / \nu$ with $g_k = [\sin\vartheta_k, \cos\vartheta_k]^T$, $d_m$ being the location of the $m$th sensor and $\nu$ the propagation velocity of the waveforms. If sufficient snapshots are collected, one can obtain the perturbation-free covariance matrix $R$ as

$$R = \lim_{N \to +\infty} \frac{1}{N} \sum_{n=1}^{N} x(t_n)\, x^H(t_n), \tag{2}$$

where $x(t) = [x_1(t), \cdots, x_M(t)]^T$, and $(\bullet)^H$ and $(\bullet)^T$ are the conjugate transpose and transpose operators, respectively. The


sampling rate of the array receiver is assumed to remain constant during the observation time, i.e., $t_n = (n-1)T_s$ with $T_s$ being the sampling interval. The $(m_1, m_2)$th element of $R$ derived from (1) is

$$R_{m_1,m_2} = \sum_{k=1}^{K}\sum_{k'=1}^{K} E\big[s_k(t+\tau_{k,m_1})\, s_{k'}^*(t+\tau_{k',m_2})\big] + \sigma^2 \delta(m_1 - m_2), \quad m_1, m_2 = 1, \cdots, M, \tag{3}$$

where $E(\bullet)$ and $(\bullet)^*$ stand for the expectation and conjugate operators, respectively, and $\delta(\bullet)$ is the indicator function. The structure of $R$ can be simplified further by dropping the cross-correlation terms in (3) in the case of independent narrowband/wideband signals, where $E[s_k(t+\tau_{k,m_1})\, s_{k'}^*(t+\tau_{k',m_2})] = \eta_k r_k(\tau_{k,m_1} - \tau_{k,m_2})\, \delta(k-k')$, with $\eta_k = E[|s_k(t)|^2]$ being the power of the $k$th signal and $r_k(\tau)$ being the normalized autocorrelation of this signal at time delay $\tau$, which satisfies $r_k(0) = 1$. Then the $m$th column of $R$ for independent signals can be formulated as

$$y_m = [R_{1,m}, \cdots, R_{M,m}]^T = \sum_{k=1}^{K} \eta_k b_{k,m} + \sigma^2 e_m = B_m \eta + \sigma^2 e_m, \tag{4}$$

where $b_{k,m} = [r_k(\tau_{k,1} - \tau_{k,m}), \cdots, r_k(\tau_{k,M} - \tau_{k,m})]^T$, $B_m = [b_{1,m}, \cdots, b_{K,m}]$, $\eta = [\eta_1, \cdots, \eta_K]^T$, and $e_m$ is an $M \times 1$ vector with the $m$th element being 1 and the others being 0. The connection of $b_{k,m}$ to $\vartheta_k$ can be made more explicit by substituting $\tau_{k,m} = d_m^T g_k / \nu$, which gives $b_{k,m} = \big[r_k\big((d_1 - d_m)^T g_k / \nu\big), \cdots, r_k\big((d_M - d_m)^T g_k / \nu\big)\big]^T$, with $d_m$, $g_k$ and $\nu$ defined in the same way as those in (1). Eq. (4) indicates that each column of $R$, after removing the term $\sigma^2 e_m$, is a weighted sum of the $K$ column vectors in $B_m$, and the vector formed by stacking the $M$ columns of $R$ one after another has the following expression,

$$y = \mathrm{vec}(R) = B\eta + \sigma^2 \tilde{e}, \tag{5}$$

where $\mathrm{vec}(\bullet)$ is the vectorization operator that forms a vector satisfying $[y]_{(m_2-1)\times M + m_1} = R_{m_1,m_2}$, $B = [B_1^T, \cdots, B_M^T]^T$ and $\tilde{e} = \mathrm{vec}(I_M)$. Eq. (5) stacks the $M$ equations given in (4) for $m = 1, \cdots, M$ into a single one. Since the vectors $b_{1,m}, \cdots, b_{K,m}$ and the matrices $B_1, \cdots, B_M$ and $B$ depend on the time delays of the signals across the array, they are direction-dependent, and we add the directional label to them as $b_{k,m}(\vartheta_k)$, $B_m(\vartheta)$ and $B(\vartheta)$ when necessary. Suppose that the incident signals are narrowband ones, or wideband ones with identical and known modulations as is assumed in [4] and [21]; then the columns of $B$ depend only on the signal directions. Therefore, those directions can be estimated by recovering the $K$ directional components from $y$. In this paper, we refer to both $y_1, \cdots, y_M$ and $y$ as covariance vectors, and seek to estimate the signal directions by decomposing them in the spatial domain.
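The stacking in (5) can be checked numerically. The following is a minimal sketch, assuming a half-wavelength uniform linear array and narrowband signals, so that $r_k(\tau) = \exp(j2\pi f\tau)$ and $b(\vartheta_k) = \mathrm{vec}(a(\vartheta_k) a^H(\vartheta_k))$; the function names `steering` and `covariance_vector_model` and all numerical values are hypothetical and chosen only for illustration.

```python
import numpy as np

def steering(theta_deg, M, spacing=0.5):
    """ULA steering vector; spacing is in wavelengths (assumed geometry)."""
    m = np.arange(M)
    return np.exp(2j * np.pi * spacing * m * np.sin(np.deg2rad(theta_deg)))

def covariance_vector_model(thetas_deg, eta, sigma2, M):
    """Return (y, B, e_tilde) with y = B @ eta + sigma2 * e_tilde, as in (5)."""
    # Narrowband case: vec(a a^H) = conj(a) kron a for column-major vec.
    B = np.column_stack([np.kron(steering(t, M).conj(), steering(t, M))
                         for t in thetas_deg])
    e_tilde = np.eye(M).flatten(order="F")   # vec(I_M)
    y = B @ np.asarray(eta) + sigma2 * e_tilde
    return y, B, e_tilde

# Illustrative values (not taken from the paper).
M, thetas, eta, sigma2 = 4, [-10.0, 25.0], np.array([1.0, 0.5]), 0.1
A = np.column_stack([steering(t, M) for t in thetas])
R = A @ np.diag(eta) @ A.conj().T + sigma2 * np.eye(M)
y, B, e_tilde = covariance_vector_model(thetas, eta, sigma2, M)

# y should coincide with the column-major vectorization of R.
print(np.allclose(y, R.flatten(order="F")))
```

Column-major flattening (`order="F"`) matches the index rule $[y]_{(m_2-1)\times M+m_1} = R_{m_1,m_2}$ stated for (5).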



In the case of correlated signals, the columns of $R$ also contain inter-signal correlation terms and they cannot be simplified as in (4). If the incident signals are wideband ones, those cross-correlation terms depend on both the signal directions and the multipath delays, which complicates the covariance vectors significantly. We skip the correlated wideband scenarios due to space limitations and leave them for future research. However, when the incident signals are correlated narrowband ones, the cross-correlation terms can be simplified as $E[s_k(t+\tau_{k,m_1})\, s_{k'}^*(t+\tau_{k',m_2})] = \alpha_{k,k'} \sqrt{\eta_k \eta_{k'}}\, \xi_{k,m_1} \xi_{k',m_2}^*$, where $\alpha_{k,k'}$ is the correlation coefficient between the $k$th and $k'$th signals, $f$ is the frequency shared by all the signals and $\xi_{k,m} = \exp(j2\pi f \tau_{k,m})$. Then the $m$th column of $R$ can be rewritten as

$$y_m = \sum_{k=1}^{K}\sum_{k'=1}^{K} \alpha_{k,k'} \sqrt{\eta_k \eta_{k'}}\, \xi_{k',m}^*\, a(\vartheta_k) + \sigma^2 e_m = A(\vartheta) u_m + \sigma^2 e_m, \tag{6}$$

where $u_m = [g_{m,1}, \cdots, g_{m,K}]^T$, $g_{m,i} = \sum_{k=1}^{K} \alpha_{i,k} \sqrt{\eta_i \eta_k}\, \xi_{k,m}^*$, $A(\vartheta) = [a(\vartheta_1), \cdots, a(\vartheta_K)]$ and $a(\vartheta_k) = [\xi_{k,1}, \cdots, \xi_{k,M}]^T$. The signal directions are used as identifiers in $A(\vartheta)$ and $a(\vartheta_k)$ because the time delays are direction-dependent. It can be concluded from (6) that, after removing the term $\sigma^2 e_m$, the covariance vectors of narrowband correlated signals also consist of $K$ directional components, while the weight vector $u_m$ varies with the column index. Therefore, the directions of correlated narrowband signals can also be estimated by reconstructing the covariance vectors.

Based on the above formulation of the covariance vectors, we propose a DOA estimator named covariance vector-based relevance vector machine, CV-RVM for short, to estimate the directions of independent narrowband/wideband signals and correlated narrowband signals. In the case of independent signals, the estimation-error-contaminated counterpart of $y$ is taken as the single measurement, which gives the single measurement vector (SMV) implementation of CV-RVM. When the incident signals are correlated narrowband ones, the vectors $y_1, \cdots, y_M$ are reconstructed jointly for DOA estimation, which forms the multiple measurement vector (MMV) implementation of CV-RVM. In order to distinguish those two implementations of the new method, we name them SMV CV-RVM and MMV CV-RVM, respectively.

III. PROPERTIES OF THE ARRAY OUTPUT COVARIANCE VECTORS

In this part, we first analyze the first- and second-order statistics of the estimation error of the covariance vectors caused by finite sampling. Then we compare the SNR of the covariance vectors with that of the raw array output, so as to partially justify our motivation for moving from the DOA estimator in [14] to the one in this paper.

A. First- and Second-Order Statistics of the Covariance Vector Estimation Error

In practical applications, the covariance matrix can only be estimated using the $N$ snapshots collected at time instants $t = t_1, \cdots, t_N$ as follows,

$$\hat{R} = \frac{1}{N} \sum_{n=1}^{N} x(t_n)\, x^H(t_n). \tag{7}$$
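As an illustration of (7), the sketch below forms the sample covariance matrix from simulated snapshots and compares it with the perturbation-free $R$ for two snapshot numbers. It assumes independent circular Gaussian narrowband signals on a half-wavelength uniform linear array; the helper names `sample_covariance` and `simulate_snapshots`, the random seed and all parameter values are hypothetical.

```python
import numpy as np

def sample_covariance(X):
    """X is M x N (one snapshot per column); returns (1/N) X X^H as in (7)."""
    return X @ X.conj().T / X.shape[1]

def simulate_snapshots(A, eta, sigma2, N, rng):
    """Independent circular Gaussian signals plus white noise (assumed model)."""
    M, K = A.shape
    S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
    S = S * np.sqrt(eta)[:, None]                      # signal powers eta_k
    V = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N)))
    V = V * np.sqrt(sigma2 / 2)                        # noise variance sigma2
    return A @ S + V

rng = np.random.default_rng(0)
M, eta, sigma2 = 4, np.array([1.0, 0.5]), 0.1
m = np.arange(M)[:, None]
A = np.exp(1j * np.pi * m * np.sin(np.deg2rad([-10.0, 25.0])))
R = A @ np.diag(eta) @ A.conj().T + sigma2 * np.eye(M)

# Frobenius norm of E = R_hat - R for a small and a large snapshot number.
errors = {N: np.linalg.norm(sample_covariance(simulate_snapshots(A, eta, sigma2, N, rng)) - R)
          for N in (100, 10000)}
print(errors)
```

The perturbation is expected to shrink roughly as $1/\sqrt{N}$, which is the behavior the remainder of this section quantifies.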

The covariance matrix estimate is contaminated by estimation error due to finite sampling. Denote $E = \hat{R} - R$, $E = [\varepsilon_1, \cdots, \varepsilon_M]$ and $\varepsilon = \mathrm{vec}(E)$; then the covariance vectors can be formulated as $\hat{y}_m = y_m + \varepsilon_m$ and $\hat{y} = y + \varepsilon$, where $\hat{y}_m = \hat{R} e_m$, $\hat{y} = \mathrm{vec}(\hat{R})$, and the expressions of $y_m$ and $y$ are given in (4)-(6). In order to better distinguish $v(t) = [v_1(t), \cdots, v_M(t)]^T$ contained in the raw array output from the $\varepsilon_m$'s in the covariance vector estimates, we call $v(t)$ "noise" and $\varepsilon$ "perturbation" or "estimation error" in the rest of the paper.

By combining (1) and (7), one can obtain the explicit expression of $\hat{R}_{m_1,m_2}$, i.e., the $(m_1, m_2)$th element of $\hat{R}$, by taking the effect of finite sampling into account as follows,

$$\begin{aligned}
\hat{R}_{m_1,m_2} ={}& \sum_{k=1}^{K}\sum_{k'=1}^{K} \frac{1}{N}\sum_{n=1}^{N} s_k(t_n+\tau_{k,m_1})\, s_{k'}^*(t_n+\tau_{k',m_2}) \\
&+ \sum_{k=1}^{K} \frac{1}{N}\sum_{n=1}^{N} s_k(t_n+\tau_{k,m_1})\, v_{m_2}^*(t_n) \\
&+ \sum_{k=1}^{K} \frac{1}{N}\sum_{n=1}^{N} s_k^*(t_n+\tau_{k,m_2})\, v_{m_1}(t_n) \\
&+ \frac{1}{N}\sum_{n=1}^{N} v_{m_1}(t_n)\, v_{m_2}^*(t_n).
\end{aligned} \tag{8}$$

Thus the expression of $E_{m_1,m_2} = [E]_{m_1,m_2}$ can be obtained by combining (8) and (3) as follows,

$$\begin{aligned}
E_{m_1,m_2} ={}& \sum_{k=1}^{K}\sum_{k'=1}^{K} \left[\frac{1}{N}\sum_{n=1}^{N} s_k(t_n+\tau_{k,m_1})\, s_{k'}^*(t_n+\tau_{k',m_2}) - E\big[s_k(t+\tau_{k,m_1})\, s_{k'}^*(t+\tau_{k',m_2})\big]\right] \\
&+ \sum_{k=1}^{K} \frac{1}{N}\sum_{n=1}^{N} s_k(t_n+\tau_{k,m_1})\, v_{m_2}^*(t_n) \\
&+ \sum_{k=1}^{K} \frac{1}{N}\sum_{n=1}^{N} s_k^*(t_n+\tau_{k,m_2})\, v_{m_1}(t_n) \\
&+ \left[\frac{1}{N}\sum_{n=1}^{N} v_{m_1}(t_n)\, v_{m_2}^*(t_n) - \sigma^2\delta(m_1-m_2)\right].
\end{aligned} \tag{9}$$

Denote the four terms on the right-hand side of (9) by $\upsilon_1, \cdots, \upsilon_4$; they are stochastic due to the randomness of the signal and noise amplitudes. Then it can be concluded from the zero-mean property and mutual independence of the signal and noise sequences that

$$E(E_{m_1,m_2}) = \sum_{i=1}^{4} E(\upsilon_i) = 0. \tag{10}$$

It can also be concluded that $E_{m_1,m_2}$ is Gaussian distributed, due to the Gaussian distribution of the signal and noise amplitudes and according to the law of large numbers when $N$ is moderately large, and the second-order statistic of $E_{m_1,m_2}$ can be obtained via straightforward calculation (with the detailed derivation provided in the Appendix) as

$$E\big[E_{m_1,m_2} E_{m_1',m_2'}^*\big] = \frac{1}{N}\sum_{\Delta t = nT_s} R_{m_1,m_1'}(\Delta t)\, R_{m_2,m_2'}^*(\Delta t) + O\!\left(\frac{1}{N^2}\right), \tag{11}$$

where $R(\Delta t) = E[x(t+\Delta t)\, x^H(t)]$, $R_{m_1,m_2}(\Delta t) = [R(\Delta t)]_{m_1,m_2}$, and $O(1/N^2)$ stands for a constant having a scaled magnitude of $1/N^2$, with the scale being 0 when the incident signals are narrowband. Thus

$$E\big[\varepsilon_m \varepsilon_m^H\big] = \frac{1}{N}\sum_{\Delta t = nT_s} [R(\Delta t)]_{m,m}\, R(\Delta t) + O\!\left(\frac{1}{N^2}\right) \triangleq Q_m, \tag{12}$$

and

$$E\big[\varepsilon \varepsilon^H\big] = \frac{1}{N}\sum_{\Delta t = nT_s} R^T(\Delta t) \otimes R(\Delta t) + O\!\left(\frac{1}{N^2}\right) \triangleq Q, \tag{13}$$

where $\otimes$ represents the Kronecker product. It should be noted that $R(\Delta t) = \delta(\Delta t) R$ and (13) can be simplified to $E[\varepsilon\varepsilon^H] = (1/N)\, R^T \otimes R$ for narrowband signals, which agrees with the results in [22, Ch. 4], while $R(\Delta t)$ has nonzero values for small $\Delta t$'s in wideband scenarios and the simplification no longer holds; e.g., it is nonzero for all $|\Delta t| < 1/B$ if the signals are PN ones with code rate $B$ [4], [21]. Nonetheless, $Q_m$ and $Q$ have nonzero off-diagonal elements for both narrowband and wideband signals according to (12) and (13), so the estimation errors of different covariance vector elements are correlated. A strict analysis of such correlations is necessary for the implementation of any maximum likelihood or Bayesian method. Another point that should be noted is that the perturbation-free quantities $Q_m$, $Q$, $R$ and $\sigma^2$ are actually unknown beforehand, but we use them directly in the theoretical analyses without further remark for notational convenience. The way to estimate those quantities will be provided in the simulation section.

B. SNR of the Covariance Vectors

In the following, we take independent narrowband signals as an example to analyze the SNR of the covariance vectors theoretically, and compare it with that of the raw array output. In such scenarios, the array SNR (ASNR) of the $k$th signal in the raw array outputs can be calculated based on (1) as

$$\mathrm{ASNR}_k = \frac{\eta_k}{\sigma^2}. \tag{14}$$

The formulation of the $m$th covariance vector can be derived from (4) as follows,

$$\hat{y}_m = B_m \eta + \sigma^2 e_m + \varepsilon_m = A(\vartheta) w_m + \sigma^2 e_m + \varepsilon_m, \tag{15}$$

where $B_m = A(\vartheta)\Phi_m$ for independent narrowband signals, $\Phi_m = \mathrm{diag}\big([\xi_{1,m}^*, \cdots, \xi_{K,m}^*]^T\big)$, $w_m = \Phi_m \eta$, $A(\vartheta)$, $\xi_{k,m}$ and $e_m$ are defined in the same way as those in (6), and $\varepsilon_m$ is the estimation error. As the estimation errors of different covariance elements are correlated, a decorrelation process should be introduced to facilitate the calculation of the SNR of the covariance vectors. The decorrelated vector is

$$\hat{y}_m' = Q_m^{-1/2} \hat{y}_m = Q_m^{-1/2} A(\vartheta) w_m + \sigma^2 Q_m^{-1/2} e_m + Q_m^{-1/2} \varepsilon_m, \tag{16}$$

where $Q_m$ is given by (12), and the estimation error of $\hat{y}_m'$ satisfies $E\big[(Q_m^{-1/2}\varepsilon_m)(Q_m^{-1/2}\varepsilon_m)^H\big] = I_M$. The component corresponding to the $k$th signal in the covariance vector is $Q_m^{-1/2} a(\vartheta_k)\, \eta_k \xi_{k,m}^*$, and the stochastic perturbation is $Q_m^{-1/2}\varepsilon_m$; thus the ASNR of the signal component with respect to the estimation perturbation is

$$\mathrm{ASNR}_k' = \frac{\eta_k^2}{M}\, a^H(\vartheta_k)\, Q_m^{-1}\, a(\vartheta_k) = \frac{N \eta_k^2}{M R_{m,m}}\, a^H(\vartheta_k)\, R^{-1} a(\vartheta_k). \tag{17}$$

The ratio of the ASNR of the covariance vector to that of the raw array output can then be calculated based on (14) and (17) as follows,

$$\rho = \frac{\mathrm{ASNR}_k'}{\mathrm{ASNR}_k} = N\, \frac{\eta_k \sigma^2\, a^H(\vartheta_k)\, R^{-1} a(\vartheta_k)}{R_{m,m} M}. \tag{18}$$

When multiple signals impinge simultaneously, it is difficult to give the explicit value of the ratio directly from (18), but its proportionality to the snapshot number clearly holds. That is to say, when moderately many snapshots are collected, the ASNR of the covariance vector will surpass that of the raw array output. In order to make the degree of the ASNR improvement clearer, we further simplify the scenario to the single-signal case. In that case $R = \eta_1 a(\vartheta_1) a^H(\vartheta_1) + \sigma^2 I_M$ and $R_{m,m} = \sigma^2 + \eta_1$, and thus the ratio given in (18) can be rewritten as

$$\rho = N\, \frac{\eta_1 \sigma^2}{(\sigma^2 + \eta_1)(\sigma^2 + M\eta_1)}. \tag{19}$$

As $\frac{(\sigma^2+\eta_1)(\sigma^2+M\eta_1)}{\eta_1\sigma^2} \ge (\sqrt{M}+1)^2$, with equality when $\sigma^2/\eta_1 = \sqrt{M}$, one can conclude from (19) that at this noise-to-signal ratio improved ASNR is obtained in the covariance vector when $N > (\sqrt{M}+1)^2$. The minimal value of $N$ for such improvement varies with $\sigma^2/\eta_1$ (in general it equals $(\sigma^2+\eta_1)(\sigma^2+M\eta_1)/(\eta_1\sigma^2)$), but it certainly exists due to the proportionality between $\rho$ and $N$.

In the above analysis we have taken the independent narrowband scenario as an example for convenience. However, as the magnitudes of the estimation errors of both narrowband and wideband covariance vectors are inversely proportional to $\sqrt{N}$, as indicated by (12), conclusions on the ASNR improvement similar to those for independent narrowband signals can be obtained for other kinds of signals.
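The single-signal ASNR ratio can be evaluated numerically from both the general expression (18) and the closed form (19). The sketch below assumes a half-wavelength uniform linear array and one narrowband signal; the function names `asnr_ratio_general` and `asnr_ratio_single` and the parameter values are hypothetical.

```python
import numpy as np

def asnr_ratio_general(a, R, eta_k, sigma2, m, N):
    """rho from (18): N * eta_k * sigma2 * a^H R^{-1} a / (R[m, m] * M)."""
    M = a.size
    quad = np.real(a.conj() @ np.linalg.solve(R, a))   # a^H R^{-1} a
    return N * eta_k * sigma2 * quad / (np.real(R[m, m]) * M)

def asnr_ratio_single(eta1, sigma2, M, N):
    """rho from (19), valid for a single incident signal."""
    return N * eta1 * sigma2 / ((sigma2 + eta1) * (sigma2 + M * eta1))

M, N, eta1 = 8, 100, 1.0
a = np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(20.0)))

pairs = []
for sigma2 in (0.1, 1.0, np.sqrt(M)):
    R = eta1 * np.outer(a, a.conj()) + sigma2 * np.eye(M)
    pairs.append((asnr_ratio_general(a, R, eta1, sigma2, 0, N),
                  asnr_ratio_single(eta1, sigma2, M, N)))
print(pairs)

# The ratio is largest at sigma2 / eta1 = sqrt(M), where it equals N / (sqrt(M) + 1)^2.
rho_best = asnr_ratio_single(eta1, np.sqrt(M) * eta1, M, N)
print(rho_best, N / (np.sqrt(M) + 1) ** 2)
```

For a given noise-to-signal ratio, the smallest snapshot number giving $\rho > 1$ can be read off by setting `N` to 1 and inverting the returned value.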

IV. COVARIANCE VECTOR-BASED DOA ESTIMATION

In this part, we propose the SMV CV-RVM method to estimate the directions of independent narrowband/wideband signals by reconstructing $\hat{y}$, and propose the MMV CV-RVM method to estimate the directions of correlated narrowband signals by reconstructing $\hat{y}_1, \cdots, \hat{y}_M$ jointly.



A. Independent Narrowband/Wideband DOA Estimation

In the independent signal scenarios, we remove the noise component from $\hat{y}$ to make the relationship between the vector and the signal directions clearer. Denote the noise-removed counterpart of $\hat{y}$ by $\hat{z}$, i.e.,

$$\hat{z} = \hat{y} - \sigma^2 \tilde{e} = \sum_{k=1}^{K} \eta_k b(\vartheta_k) + \varepsilon, \tag{20}$$

where $b(\vartheta_k) = [b_{k,1}^T, \cdots, b_{k,M}^T]^T$ with $b_{k,m}$ defined in the same way as that in (4). Eq. (20) indicates that, apart from the estimation error, $\hat{z}$ is a weighted sum of the $K$ signal components $b(\vartheta_k)$ for $k = 1, \cdots, K$; thus the signal directions can be estimated if those vectors are recovered from $\hat{z}$. In the following, we introduce the SBL technique [15] to recover those vectors and estimate the directions of independent narrowband/wideband signals.

In order to recover the signal components from $\hat{z}$, we first sample the potential space of the incident signals discretely to yield a direction set $\Theta = [\theta_1, \cdots, \theta_I]$ and form the corresponding manifold dictionary $B(\Theta) = [b(\theta_1), \cdots, b(\theta_I)]$, with $b(\theta_i)$ constructed according to the formulation of $b(\vartheta_k)$ in (20) by replacing $\vartheta_k$ with $\theta_i$. For example, sampling the $[-90^\circ, 90^\circ]$ space with interval $\Delta\theta = 1^\circ$ forms a set $\Theta = [-90^\circ, -89^\circ, \cdots, 90^\circ]$ and a dictionary accordingly; the direction set covers the possible space of the incident signals. Then $\hat{z}$ can be rewritten in the following form,

$$\hat{z} = \sum_{i=1}^{I} \bar{\eta}_i b(\theta_i) + \varepsilon = B(\Theta)\bar{\eta} + \varepsilon, \tag{21}$$

where $\bar{\eta} = [\bar{\eta}_1, \cdots, \bar{\eta}_I]^T$ is a zero-padded extension of $\eta = [\eta_1, \cdots, \eta_K]^T$ from $\vartheta$ to $\Theta$, i.e., it has nonzero values only for $\theta \in \vartheta$. In practice, the discrete set $\Theta$ is never dense enough to include all the possible true source directions, but the grid mismatch does not significantly degrade the validity of the overcomplete model [4], [11]-[14]. In most practical array processing problems, $I \gg M > K$ is satisfied; thus the dictionary $B(\Theta)$ is overcomplete and (21) is a sparse model. The SBL technique is introduced to extract the basis set $[b(\vartheta_1), \cdots, b(\vartheta_K)]$ from $B(\Theta)$ to approximate $\hat{z}$ under a model parsimony constraint. We refer interested readers to [14] for more detailed explanations of the advantages of the SBL technique in DOA estimation by exploiting the spatial sparsity of the incident signals.

Based on the above overcomplete formulation of the covariance vector, we assume that $\bar{\eta}$ is Gaussian distributed as $\bar{\eta} \sim \mathcal{N}(0, \Gamma)$, with $\Gamma = \mathrm{diag}(\gamma)$ and $\gamma = [\gamma_1, \cdots, \gamma_I]$; then the probability of $\hat{z}$ with respect to $\gamma$ is given as

$$\begin{aligned}
p(\hat{z}; \gamma) &= \int p(\hat{z}\,|\,\bar{\eta})\, p(\bar{\eta}; \gamma)\, d\bar{\eta} \\
&= |\pi\Sigma_{\hat{z}}|^{-1} \exp\big(-\hat{z}^H \Sigma_{\hat{z}}^{-1} \hat{z}\big) \times \int |\pi\Sigma_{\bar{\eta}}|^{-1} \exp\big(-(\bar{\eta}-\mu)^H \Sigma_{\bar{\eta}}^{-1} (\bar{\eta}-\mu)\big)\, d\bar{\eta} \\
&= |\pi\Sigma_{\hat{z}}|^{-1} \exp\big(-\hat{z}^H \Sigma_{\hat{z}}^{-1} \hat{z}\big),
\end{aligned} \tag{22}$$

where

$$\mu = \Gamma B^H(\Theta) \big(Q + B(\Theta)\Gamma B^H(\Theta)\big)^{-1} \hat{z}, \tag{23}$$

$$\Sigma_{\bar{\eta}} = \Gamma - \Gamma B^H(\Theta) \big(Q + B(\Theta)\Gamma B^H(\Theta)\big)^{-1} B(\Theta)\Gamma, \tag{24}$$

$$\Sigma_{\hat{z}} = Q + B(\Theta)\Gamma B^H(\Theta). \tag{25}$$

The coefficient vector $\bar{\eta}$ in (21) should be non-negative, as it represents the spatial power distribution of the incident signals. However, since non-Gaussian assumptions generally hinder Bayesian parameter estimation processes greatly, we abandon this prior information and follow the guideline of SBL to assign a Gaussian distribution to $\bar{\eta}$. Eq. (22) reveals the relationship between $\hat{z}$ and $\gamma$, and $\gamma$ can be optimized by maximizing the likelihood function. After that, $\bar{\eta}$ is calculated according to (23) and the indexes of its nonzero elements indicate the signal directions. Taking the logarithm of (22) and neglecting the constants yields the following objective function for optimizing $\gamma$,

$$L(\gamma) = \ln|\Sigma_{\hat{z}}| + \hat{z}^H \Sigma_{\hat{z}}^{-1} \hat{z}. \tag{26}$$

The EM algorithm [23] can then be used to estimate $\gamma$ by minimizing this objective function. During each EM iteration, the first- and second-order posterior moments of $\bar{\eta}$ are calculated with (23) and (24) in the E-step, and $\gamma$ is updated by minimizing $L(\gamma)$ according to $\partial L(\gamma)/\partial\gamma = 0$ in the M-step, which results in the following update strategy for $\gamma$,

$$\gamma_i^{(q)} = \big|\mu_i^{(q)}\big|^2 + \big(\Sigma_{\bar{\eta}}^{(q)}\big)_{i,i}, \tag{27}$$

where the superscript $(\bullet)^{(q)}$ represents the $q$th iteration, $\mu^{(q)}$ and $\Sigma_{\bar{\eta}}^{(q)}$ are calculated with (23) and (24) in the $q$th iteration, $\mu_i$ is the $i$th element of $\mu$, and $(\Sigma_{\bar{\eta}})_{i,i}$ is the $(i,i)$th element of $\Sigma_{\bar{\eta}}$. The update strategy given in (27) can be replaced with a fixed-point iteration as $\gamma_i^{(q)} = \big|\mu_i^{(q)}\big|^2 \big/ \big(1 - (\Sigma_{\bar{\eta}}^{(q)})_{i,i} / \gamma_i^{(q-1)}\big) + \varsigma$ to speed up the convergence of the EM algorithm, with $\varsigma$ being a small positive value [14], [15]. The initialization and termination criteria of the EM algorithm are also set similarly to those in [14].

It should be noted that the noise component $\sigma^2\tilde{e}$ can also be recovered from the covariance vector by taking $\hat{y}$ as the measurement and combining $\tilde{e}$ into $B(\Theta)$ to form a dictionary $[B(\Theta), \tilde{e}]$; the reconstruction process then follows the same guideline as the one presented above. In the reconstruction result, the coefficient of $\tilde{e}$ stands for the noise variance estimate. However, as such a model extension complicates the sparsity profile, we have concluded from extensive empirical evidence that the extension deteriorates both the computational efficiency and the reconstruction performance of the method in most cases. Therefore, we choose to estimate the noise variance directly from the array output (with the estimation method given in Section VI), and remove the noise component from $\hat{y}$ before the reconstruction process.

Since the predefined direction set $\Theta$ is formed via discrete spatial sampling, notable quantization errors may be introduced into the DOA estimates if the peak locations of $\hat{\gamma}$ are taken as the source directions directly. Therefore, we introduce a refined scanning process similar to that in [14] to improve the DOA estimation precision based on the reconstruction result.
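The EM iteration defined by (23), (24) and (27) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: it assumes a half-wavelength uniform linear array, narrowband signals (so that $b(\theta) = \mathrm{vec}(a(\theta)a^H(\theta))$ and $Q = (1/N) R^T \otimes R$ built from the true $R$), a deliberately coarse $10^\circ$ grid with on-grid sources, and a perturbation-free $\hat{z}$ so that the run is deterministic. The names `steering`, `b_vec` and `sbl_em` are hypothetical, and the plain update (27) is used rather than the fixed-point variant.

```python
import numpy as np

def steering(theta_deg, M):
    """Half-wavelength ULA steering vector (assumed geometry)."""
    return np.exp(1j * np.pi * np.arange(M) * np.sin(np.deg2rad(theta_deg)))

def b_vec(theta_deg, M):
    """Narrowband b(theta) = vec(a a^H) = conj(a) kron a."""
    a = steering(theta_deg, M)
    return np.kron(a.conj(), a)

def sbl_em(z, B, Q, n_iter=100):
    """EM updates (23), (24), (27); also records objective (26) per iteration."""
    gamma = np.ones(B.shape[1])
    history = []
    for _ in range(n_iter):
        BG = B * gamma                                   # B Gamma
        Sigma_z = Q + BG @ B.conj().T                    # (25)
        Sz_inv_z = np.linalg.solve(Sigma_z, z)
        _, logdet = np.linalg.slogdet(Sigma_z)
        history.append(logdet + np.real(z.conj() @ Sz_inv_z))   # (26)
        GBh = BG.conj().T                                # Gamma B^H (gamma is real)
        mu = GBh @ Sz_inv_z                              # (23)
        # Diagonal of (24): gamma_i - [Gamma B^H Sigma_z^{-1} B Gamma]_{ii}.
        quad = np.real(np.sum(GBh * np.linalg.solve(Sigma_z, BG).T, axis=1))
        gamma = np.abs(mu) ** 2 + (gamma - quad)         # (27)
    return gamma, mu, np.array(history)

# Illustrative scenario (values are assumptions, not from the paper).
M, N, sigma2 = 8, 200, 0.1
grid = np.arange(-60, 61, 10)
true_doas, eta = [-20, 30], np.array([1.0, 0.5])
A = np.column_stack([steering(t, M) for t in true_doas])
R = A @ np.diag(eta) @ A.conj().T + sigma2 * np.eye(M)
Q = np.kron(R.T, R) / N                                  # narrowband form of (13)
B = np.column_stack([b_vec(t, M) for t in grid])
z = np.column_stack([b_vec(t, M) for t in true_doas]) @ eta   # noise-removed vector

gamma, mu, history = sbl_em(z, B, Q)
peaks = sorted(int(grid[i]) for i in np.argsort(gamma)[-2:])
print(peaks)
```

In practice $Q$, $R$ and $\sigma^2$ would be replaced by their estimates and the grid would be much finer, which is why the refined scanning step described in the text is needed.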


Denote the estimates of $\gamma$ and $\Sigma_{\hat{z}}$ when the iterative reconstruction process is terminated by $\gamma^{\#}$ and $\Sigma_{\hat{z}}^{\#}$, respectively, the direction sets in $\Theta$ corresponding to the spectral lines associated with each signal by $\boldsymbol{\theta}_1, \cdots, \boldsymbol{\theta}_K$, and the hyper-parameter set associated with $\boldsymbol{\theta}_k$ by $\boldsymbol{\gamma}_k$. Define $\Theta_{-k} = \Theta \backslash \boldsymbol{\theta}_k$ (i.e., removing $\boldsymbol{\theta}_k$ from $\Theta$ yields $\Theta_{-k}$), $\gamma_{-k} = \gamma^{\#} \backslash \boldsymbol{\gamma}_k$, $\Gamma_{-k} = \mathrm{diag}(\gamma_{-k})$, $\Gamma_k = \mathrm{diag}(\boldsymbol{\gamma}_k)$, and $\Sigma_{-k} = Q + B(\Theta_{-k})\Gamma_{-k}B^H(\Theta_{-k})$; then $\Sigma_{-k}$ can be regarded as the covariance matrix of the estimation error and the other $K-1$ signal components in $\hat{z}$, excluding the $k$th one. In order to obtain refined DOA estimates for the incident signals, we use a single spectral line $\beta_k b(\theta) b^H(\theta)$ in place of the spectral peak $B(\boldsymbol{\theta}_k)\Gamma_k B^H(\boldsymbol{\theta}_k)$ to represent the $k$th signal component in the covariance matrix. By introducing the new formulation into (26), one can obtain the following objective function for estimating $\vartheta_k$,

$$L(\beta_k, \theta) = \ln\big|\Sigma_{-k} + \beta_k b(\theta) b^H(\theta)\big| + \hat{z}^H \big(\Sigma_{-k} + \beta_k b(\theta) b^H(\theta)\big)^{-1} \hat{z}. \tag{28}$$

The estimate of $\beta_k$ can be derived according to $\partial L(\beta_k, \theta)/\partial\beta_k = 0$ as follows,

$$\hat{\beta}_k = \frac{b^H(\theta)\, \Sigma_{-k}^{-1} \big(\hat{z}\hat{z}^H - \Sigma_{-k}\big) \Sigma_{-k}^{-1}\, b(\theta)}{\big(b^H(\theta)\, \Sigma_{-k}^{-1}\, b(\theta)\big)^2}. \tag{29}$$

Then substituting (29) into $\partial L(\beta_k, \theta)/\partial\theta = 0$ yields the following equality,

$$g(\theta) \triangleq \mathrm{Re}\left\{ b^H(\theta)\, \Sigma_{-k}^{-1} \Big[ b(\theta) b^H(\theta)\, \Sigma_{-k}^{-1}\, \hat{z}\hat{z}^H - \hat{z}\hat{z}^H\, \Sigma_{-k}^{-1}\, b(\theta) b^H(\theta) \Big] \Sigma_{-k}^{-1}\, \frac{d[b(\theta)]}{d\theta} \right\} = 0, \tag{30}$$

where $b(\theta)$ is constructed according to the formulation of $b(\vartheta_k)$ in (20) by replacing $\vartheta_k$ with $\theta$. In practice, this equality may not hold exactly due to measurement perturbation or inaccurate signal reconstruction, and the refined DOA estimate of the $k$th signal should be obtained via 1-D scanning by checking the distance of $g(\theta)$ from 0, i.e.,

$$\hat{\vartheta}_k = \arg\max_{\theta \in \Omega_k} |g(\theta)|^{-1}, \tag{31}$$

where $\Omega_k$ represents the peak scope of the $k$th signal.

B. Correlated Narrowband DOA Estimation

In the case of correlated narrowband signals, the signal components have different weights in $y_1, \cdots, y_M$ according to (6); thus they cannot be reconstructed uniformly in the same way as for independent signals. However, as they share the identical bases $a(\vartheta_1), \cdots, a(\vartheta_K)$, a joint reconstruction procedure can be introduced to recover the signal components in those vectors and estimate the signal directions. When one reconstructs the covariance vectors $\hat{y}_1, \cdots, \hat{y}_M$ jointly for DOA estimation, their estimation errors should be dealt with carefully. That is because the second-order statistic given in (13) indicates that the estimation errors of $\hat{y}_1, \cdots, \hat{y}_M$ are correlated with each other, and they should therefore be decorrelated as a whole before the reconstruction procedure. We use $Q$ to decorrelate


$\hat{y} = [\hat{y}_1^T, \cdots, \hat{y}_M^T]^T$ after removing the noise component to obtain the following covariance vector,

$$\hat{z}' = Q^{-1/2}\big(\hat{y} - \sigma^2\tilde{e}\big) = z' + Q^{-1/2}\varepsilon, \tag{32}$$

where $z' = Q^{-1/2}\big(\mathrm{vec}(R) - \sigma^2\tilde{e}\big)$ is the perturbation-free counterpart of $\hat{z}'$, and the decorrelated perturbation satisfies

$$E\Big[\big(Q^{-1/2}\varepsilon\big)\big(Q^{-1/2}\varepsilon\big)^H\Big] = I_{M^2}, \tag{33}$$

which means that the $M$ subvectors of size $M \times 1$ in $Q^{-1/2}\varepsilon$ are independent of each other. Moreover, as $Q^{-1/2} = \sqrt{N}\,\big(R^{-1/2}\big)^T \otimes R^{-1/2}$, the $m$th $M \times 1$ subvector of $z'$ can be formulated as

$$z_m' = \sqrt{N} \sum_{i=1}^{M} c_{m,i}\, R^{-1/2}\big(y_i - \sigma^2 e_i\big) = \sqrt{N}\, R^{-1/2} A(\vartheta) \sum_{i=1}^{M} c_{m,i}\, u_i, \tag{34}$$

where $c_{m,i}$ is the $(m,i)$th element of $\big(R^{-1/2}\big)^T$. Eq. (34) indicates that the $z_m'$'s share identical sparsity profiles when decomposed on the overcomplete dictionary $R^{-1/2}A(\Theta)$, with $A(\Theta) = [a(\theta_1), \cdots, a(\theta_I)]$ defined similarly to $B(\Theta)$ in (21); thus the directions of narrowband correlated signals can be estimated by reconstructing $\hat{z}_1', \cdots, \hat{z}_M'$ jointly. Denote $u_m' = \sum_{i=1}^{M} c_{m,i} u_i$ and $A'(\Theta) = \sqrt{N} R^{-1/2} A(\Theta)$, let $\bar{u}_m'$ be the zero-padded extension of $u_m'$ from $\vartheta$ to $\Theta$, and assume $\bar{u}_m' \sim \mathcal{N}(0, \Gamma)$; then the update strategy for estimating $\gamma$ can be derived similarly to that given in (27) as follows,

$$\gamma_i^{(q)} = \big\|\mathcal{M}_{i\bullet}^{(q)}\big\|_2^2 \big/ M + \big(\Sigma_{\mathcal{M}}^{(q)}\big)_{i,i}, \tag{35}$$

where $\mathcal{M}^{(q)}$ and $\Sigma_{\mathcal{M}}^{(q)}$ are the first- and second-order posterior moments of $\bar{U}' = [\bar{u}_1', \cdots, \bar{u}_M']$, with $\mathcal{M}^{(q)} = \Gamma^{(q-1)} A'^H(\Theta)\, \Psi^{-1} \hat{Z}'$ and $\Sigma_{\mathcal{M}}^{(q)} = \Gamma^{(q-1)} - \Gamma^{(q-1)} A'^H(\Theta)\, \Psi^{-1} A'(\Theta)\, \Gamma^{(q-1)}$, in which $\Psi = I_M + A'(\Theta)\Gamma^{(q-1)}A'^H(\Theta)$, $\hat{Z}' = [\hat{z}_1', \cdots, \hat{z}_M']$, and $\hat{z}_m'$ is the $m$th $M \times 1$ subvector of $\hat{z}' = Q^{-1/2}\big(\hat{y} - \sigma^2\tilde{e}\big)$. When the predefined termination criterion of the EM algorithm is satisfied, the joint reconstruction result can be used to realize refined DOA estimation according to (31) with $g(\theta)$ given as

$$g(\theta) = \mathrm{Re}\left\{ a'^H(\theta)\, \Sigma_{-k}^{-1} \Big[ a'(\theta) a'^H(\theta)\, \Sigma_{-k}^{-1}\, \hat{Z}'\hat{Z}'^H - \hat{Z}'\hat{Z}'^H\, \Sigma_{-k}^{-1}\, a'(\theta) a'^H(\theta) \Big] \Sigma_{-k}^{-1}\, \frac{d[a'(\theta)]}{d\theta} \right\}, \tag{36}$$

where $a'(\theta) = \sqrt{N} R^{-1/2} a(\theta)$ and $\Sigma_{-k}$ is defined similarly to its counterpart in (31).

It should be noted that, when all the incident narrowband signals are coherent with each other, the matrices $R$ and $Q$ are nearly singular when the SNR is high, and the decorrelation process in (32) becomes unstable. In such scenarios, the matrix inversion lemma should be applied when calculating the inverses of $R$ and $Q$. Taking $R$ as an example, its inverse can be calculated as

$$R^{-1} = \sigma^{-2} I_M - \frac{\sigma^{-2}\big(R - \sigma^2 I_M\big)}{\mathrm{tr}(R) - (M-1)\sigma^2}, \tag{37}$$



which is derived by reformulating $R$ as $R = \lambda_1 h h^H + \sigma^2 I_M$, with $\lambda_1 + \sigma^2$ and $h$ being the largest eigenvalue of $R$ and the unit-norm eigenvector associated with it, respectively.

V. THEORETICAL LOWER BOUND AND MAXIMAL SEPARABLE SOURCE NUMBER FOR INDEPENDENT SIGNALS

In this section, we analyze the theoretical lower bound and the maximal separable source number of CV-RVM in the case of independent signals.
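As a concrete illustration of the second quantity analyzed in this section, the following Python sketch enumerates the pairwise sensor spacings of a linear array, in units of $D_0$, and counts how many consecutive multiples $D_0, 2D_0, \cdots, \kappa D_0$ they cover. The sensor positions `[0, 1, 3]` and `[0, 1, 4, 6]` are the commonly tabulated 3- and 4-element minimum redundancy layouts and are assumed here rather than taken from this paper; the function names `distinct_spacings` and `kappa` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def distinct_spacings(positions):
    """Sorted set of pairwise sensor spacings, in units of D0."""
    return sorted({abs(a - b) for a, b in combinations(positions, 2)})


def kappa(positions):
    """Largest k such that the spacings 1, 2, ..., k (times D0) all occur."""
    spacings = set(distinct_spacings(positions))
    k = 0
    while k + 1 in spacings:
        k += 1
    return k


# Assumed sensor layouts (positions in units of D0), for illustration only.
arrays = {
    "ULA, M=4": [0, 1, 2, 3],
    "MRA, M=3": [0, 1, 3],
    "MRA, M=4": [0, 1, 4, 6],
}

for name, pos in arrays.items():
    M = len(pos)
    bound = (M * M - M) // 2  # off-diagonal entries of R, counted once
    print(f"{name}: kappa = {kappa(pos)}, (M^2 - M)/2 = {bound}")
```

Under these assumed layouts, the two minimum redundancy arrays cover 3 and 6 consecutive spacings, which coincides with $(M^2 - M)/2$, whereas the 4-element uniform linear array covers only 3.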

A. Theoretical Lower Bound

In the case of independent narrowband and wideband signals, the measurement vector used for DOA estimation in CV-RVM is given by (20), where the estimation error is Gaussian distributed with covariance matrix $Q$. Thus the likelihood of $\hat y$ with respect to the unknown variables $\vartheta$, $\eta$ and $\sigma^2$ is

$$p\left(\hat y \,\middle|\, \vartheta, \eta, \sigma^2\right) = |\pi Q|^{-1} \exp\left\{ -\left[\hat y - \sigma^2 \tilde e - \sum_{k=1}^{K} \eta_k b(\vartheta_k)\right]^H Q^{-1} \left[\hat y - \sigma^2 \tilde e - \sum_{k=1}^{K} \eta_k b(\vartheta_k)\right] \right\}. \tag{38}$$

Based on the likelihood function, the secondary data (i.e., the covariance vectors)-based Cramer-Rao lower bound (SdCRLB) of the DOA estimates can be derived similarly to that in [24, Appendix E] by setting the snapshot number to 1 and the coefficient vector to be real,

$$\mathrm{SdCRLB}^{-1}(\vartheta) = 2\,\mathrm{diag}(\eta) \left\{ \mathrm{Re}\left[D^H Q^{-1} D\right] - \mathrm{Re}\left[D^H Q^{-1} \tilde B\right] \left(\mathrm{Re}\left[\tilde B^H Q^{-1} \tilde B\right]\right)^{-1} \mathrm{Re}\left[\tilde B^H Q^{-1} D\right] \right\} \mathrm{diag}(\eta), \tag{39}$$

where $\eta = [\eta_1, \cdots, \eta_K]^T$, $\tilde B = [B(\vartheta), \tilde e]$, and $D = \left[ \left.\frac{d[b(\theta)]}{d\theta}\right|_{\theta=\vartheta_1}, \cdots, \left.\frac{d[b(\theta)]}{d\theta}\right|_{\theta=\vartheta_K} \right]$, with $b(\theta)$ defined in the same way as that in (30).

The SdCRLB is of little practical interest for narrowband signals, since the raw array output can be exploited directly to obtain the Cramer-Rao lower bound (CRLB) [24]. However, when the incident signals are wideband with an exploitable temporal correlation property, the SdCRLB provides a good benchmark for evaluating the effectiveness of the proposed DOA estimator.

B. Maximal Separable Source Number

The maximal separable source number of CV-RVM depends on the non-ambiguity property of the equation $y = \sum_{k=1}^{K} \eta_k b(\vartheta_k) + \sigma^2 \tilde e$ during the reconstruction procedure. As $R$ is conjugate symmetric, its upper-right elements are completely represented by the lower-left ones, and the diagonal elements are identical to each other and do not contain the directional information of the signals. Moreover, when the array has a special geometry such as a uniform linear one, many duplicated elements are contained in $y$, and they do not contribute to the processing capability of CV-RVM. Therefore, the maximal separable source number of CV-RVM can be analyzed based on a simplified structure of $y$. Taking linear arrays with uniform or minimum redundancy geometries [25] as an example, suppose that the spacings between every two array elements form the set $\{D_0, 2D_0, \cdots, \kappa D_0\}$; then the maximal separable source number of CV-RVM is determined by the property of the following vector when decomposed on the dictionary $B_{dr}(\Theta)$,

$$y_{dr} = \sum_{k=1}^{K} \eta_k b_{dr}(\vartheta_k), \tag{40}$$

where $y_{dr}$, with the subscript $(\bullet)_{dr}$ being short for "duplication removed", equals $y$ when the elements associated with the duplicated and diagonal elements of $R$ (together with the noise component $\sigma^2 \tilde e$) are removed, $b_{dr}(\vartheta_k)$ is a variant of $b(\vartheta_k)$ that retains the elements associated with $y_{dr}$, and $B_{dr}(\Theta) = \{ b_{dr}(\theta) \,|\, \theta \in \Theta \}$. The formulation of the measurement vector given in (40) is identical to the Fourier coefficients in Corollary 1.2 of [26] in the case of narrowband signals, and is the same as the vector in Theorem 1 of [4] for wideband signals. Thus the propositions given in those references can be used to conclude that the maximal number of independent signals that CV-RVM can separate is $\kappa$, which depends on the array geometry and can be as large as $(M^2 - M)/2$, where $M^2$ represents the dimension of $y$, $-M$ means excluding the diagonal elements, and the divisor of 2 means removing the duplicated information in the lower-left and upper-right parts of the covariance matrix. The upper bound can be achieved in well-designed arrays. For example, the 3-element and 4-element minimum redundancy arrays [25] are able to separate 3 and 6 independent signals, respectively, according to the analysis, which equal or even exceed the sensor number.

VI. SIMULATION RESULTS

In this section, we carry out simulations to demonstrate the performance of the proposed method in independent narrowband/wideband and correlated narrowband DOA estimation. The SdCRLB given in (39) will be used for independent wideband signals to evaluate the effectiveness of the methods, while the CRLB given in [24] is used for narrowband signals. As the covariance matrices $R(\Delta t)$ for both zero and nonzero $\Delta t$'s are not available in practice, their estimates derived from the array outputs as $\hat R(\Delta t) = \frac{1}{N - \Delta n} \sum_{n=\Delta n+1}^{N} x(n)\, x^H(n - \Delta n)$ with $\Delta n = \Delta t / T_s$ will

be used instead, and the matrix Q is calculated with those estimates approximately, i.e., ˆ= 1 ˆ T ((Δn) Ts ) ⊗ R ˆ ((Δn) Ts ). Q R (41) N |Δn|