IEEE TRANSACTIONS ON SIGNAL PROCESSING
Multidimensional Frequency Estimation with Finite Snapshots in the Presence of Identical Frequencies

Jun Liu, Student Member, IEEE, Xiangqian Liu, Member, IEEE, and Xiaoli Ma, Member, IEEE
Abstract—Recently, an eigenvector-based algorithm was developed for multidimensional frequency estimation from a single snapshot of a data mixture. Unlike most existing algebraic approaches, which estimate frequencies from eigenvalues, the eigenvector-based algorithm achieves automatic frequency pairing without joint diagonalization of multiple matrices; however, it fails when identical frequencies exist in certain dimensions, because the eigenvectors are then no longer linearly independent. In this paper, we develop an eigenvector-based algorithm for multidimensional frequency estimation with finite data snapshots. We also analyze the identifiability and performance of the proposed algorithm. It is shown that our algorithm offers the most relaxed statistical identifiability condition for multidimensional frequency estimation from finite snapshots. More importantly, we introduce complex weighting factors so that the algorithm remains operational when identical frequencies exist in one or more dimensions. Furthermore, the weighting factors are optimized to minimize the mean-square errors of the frequency estimates. Simulation results show that the proposed algorithm offers competitive performance compared to existing algebraic approaches, but at lower complexity.

Index Terms—Frequency estimation, eigenvalue decomposition, multidimensional signal processing, identifiability, perturbation analysis.
I. INTRODUCTION

In radar, signal processing, and communications, many parameter estimation problems can be reduced to two-dimensional (2-D) or, in general, N-dimensional (N-D) frequency estimation, such as 2-D direction estimation [26], joint frequency and 2-D angle estimation [25], joint angle and delay estimation [23], joint angle and frequency estimation [10], and multidimensional parameter estimation in wireless channel sounding [14], [21]. In these problems, it is desirable to estimate multiple N-D frequency components, each with N frequencies, from several snapshots of observation data. High-resolution subspace-based parametric methods based on eigenvalue decomposition (EVD) have been well studied for N-D frequency estimation, such as N-D MUSIC and N-D Unitary ESPRIT (see, e.g., [5], [14], [17]).

Manuscript received January 30, 2007; revised April 12, 2007. This work was supported in part by the U.S. Army Research Laboratory and the U.S. Army Research Office under grant number W911NF-05-1-0485. Parts of this work were presented at the 7th IEEE Workshop on Signal Processing Advances in Wireless Communications, Cannes, France, July 2-5, 2006, and the 4th IEEE Workshop on Sensor Array and Multi-channel Processing, Waltham, MA, July 12-14, 2006. The associate editor coordinating the review of this paper and approving it for publication was Dr. Peter Händel. J. Liu and X. Liu are with the Dept. of Electrical and Computer Engineering, University of Louisville, Louisville, KY 40292 USA (e-mail: [email protected], [email protected]). X. Ma is with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: [email protected]).
One of the important problems for parametric frequency estimation methods is to determine the maximum number of resolvable frequency components for a given sample size in the noiseless case, which is referred to as the identifiability bound herein. The identifiability bound is of importance in situations where data samples come at a premium, for example, in spatial sampling for direction estimation using an antenna array, and in time sampling for time-varying parameter estimation in wireless channel sounding. Recently much progress has been made on the statistical identifiability of multidimensional frequency estimation from a single snapshot of data [9], [13], [15]. For example, in the 2-D case, the MDF algorithm [15] can uniquely estimate up to $0.25M_1M_2$ 2-D frequencies almost surely from a single snapshot of an $M_1 \times M_2$ data mixture, provided that the 2-D frequencies are drawn from a continuous distribution (hence the name of statistical identifiability). We have shown in [13] that the improved MDF (IMDF) algorithm can retrieve up to approximately $0.34M_1M_2$ 2-D frequencies for the same data size. The aforementioned identifiability results concern the single-snapshot case; few results are available for the multiple-snapshot case. When multiple snapshots are available, most existing algebraic approaches estimate frequencies from the sample covariance matrix. A sufficiently large number of snapshots usually needs to be collected in order to calculate the sample covariance matrix. For example, it is assumed that the number of snapshots is greater than the data size of one snapshot for the MUSIC algorithm in 1-D frequency estimation [22]. However, in many cases only a few snapshots may be collected. For example, in multiple-target tracking, only a few snapshots of data may be available for a fixed 2-D DOA if the targets move fast. In these circumstances, the study of the identification condition for finite snapshots becomes important.
For instance, in [11], an identifiability bound is obtained for the estimation of multiple constant modulus signals from finite snapshots. Another challenge of N-D frequency estimation is the frequency association or frequency pairing problem, which concerns how to associate the N frequencies of the same N-D frequency component. If N-D MUSIC is implemented by N-D grid search, no frequency association step is needed, but N-D grid search is very complex. Alternatively, 1-D frequency estimation can be performed in individual dimensions of the N-D MUSIC spectrum, followed by a frequency pairing step [8], [17]. On the other hand, the Unitary ESPRIT algorithm uses an iterative Jacobi-type algorithm for approximate joint diagonalization to achieve automatic frequency pairing [5]. Recently, as an alternative to existing algebraic algorithms that
estimate frequencies from eigenvalues, an eigenvector-based algorithm (the IMDF algorithm) was developed in [13] for N-D frequency estimation with a single snapshot of a data mixture. The IMDF algorithm achieves automatic frequency pairing without using joint diagonalization, since the frequencies of an N-D frequency component are estimated from the same column of a structural matrix, and thus the computational complexity is reduced. The transformation matrix between the structural matrix and the signal subspace is obtained from the eigenvectors of a matrix pencil [13]. However, the IMDF algorithm fails when there exist identical frequencies in the first dimension, because in this case the corresponding eigenvectors are no longer linearly independent. Note that eigenvalue-based approaches do not have this drawback.

In this paper, we develop an eigenvector-based algorithm for N-D frequency estimation with multiple snapshots, in the presence of possible identical frequencies in one or more dimensions. Our approach is based on the IMDF algorithm, but it is not a trivial extension of that algorithm to the multiple-snapshot case. Similar to IMDF, our proposed algorithm estimates frequencies from a structural matrix that is obtained from the eigenvectors. More importantly, our novel contribution is two-fold: the most relaxed identifiability bound for N-D frequency estimation with finite snapshots, and optimal weighting factors to deal with identical frequencies in one or more dimensions. Intuitively, the maximum number of resolvable frequency components should increase with the number of snapshots until it reaches a threshold. We prove that when the number of snapshots is smaller than a threshold (approximately $M/2$), the identifiability bound of our N-D frequency estimation algorithm is approximately $2TM/(1+\sqrt[N]{2T})^N$, where $T$ is the number of snapshots and $M$ is the sample size of one snapshot.
When the number of snapshots is greater than the threshold, both our approach and those based on the sample covariance matrix give the same identifiability bound, which is approximately $M$. To deal with possible identical frequencies in one or more dimensions, we introduce complex weighting factors in our eigenvector-based approach to ensure that the eigenvalues of the matrix pencil obtained from the signal subspace are distinct, so that the eigenvectors are linearly independent. Similar factors have been adopted in the (eigenvalue-based) 2-D ESPRIT-type algorithm [19]. However, here we derive the theoretical mean-square errors (MSEs) of the frequency estimates and optimize the weighting factors by minimizing the MSEs, while the weighting factors in [19] are chosen randomly. Numerical simulation results demonstrate that the proposed approach is effective in improving the performance of eigenvector-based N-D frequency estimation in the presence of identical frequencies, and that optimal weighting factors significantly outperform random weighting factors, since randomly chosen factors cannot guarantee that the MSEs are minimized in the noisy case.

The rest of the paper is organized as follows. The data model and several preliminary lemmas for the identifiability argument are introduced in Section II. In Section III we present the eigenvector-based approach for N-D frequency estimation from finite data snapshots. Section IV gives the statistical
identifiability bound of the proposed algorithm. The performance of the proposed algorithm in the noisy case is analyzed in Section V, where the theoretical MSEs are derived; based on these, an approach to obtain optimal weighting factors is proposed in Section VI. Simulation results are discussed in Section VII and conclusions are drawn in Section VIII.

In the following, upper (lower) bold face letters are used for matrices (column vectors). $A^*$, $A^T$, $A^H$, and $A^\dagger$ denote the conjugate, transpose, Hermitian transpose, and pseudo-inverse of $A$, respectively. We will use $\otimes$ for the Kronecker product and $\odot$ for the Khatri-Rao (column-wise Kronecker) product,
\[
[a_1\ a_2\ \cdots\ a_N] \odot [b_1\ b_2\ \cdots\ b_N] := [a_1 \otimes b_1\ \ a_2 \otimes b_2\ \cdots\ a_N \otimes b_N].
\]
We also use $I_p$ for a $p \times p$ identity matrix, $0_{M\times N}$ for an $M \times N$ zero matrix, $D(a)$ for a diagonal matrix with $a$ as its diagonal, $A^{(m)}$ for a sub-matrix of $A$ formed by its first $m$ rows, $\|\cdot\|$ for the $l_2$ norm, $a_{i,j}$ or $[A]_{i,j}$ for the $(i,j)$-th element of $A$, $E(\cdot)$ for expectation, $\mathbb{C}$ for the complex field, $\mathcal{O}$ for the set $(-\pi,\pi]$, and $\Re(\cdot)$ and $\Im(\cdot)$ for the real and imaginary parts, respectively.

II. DATA MODEL

The data model for N-D frequency estimation can take a variety of forms. In this paper, we refer to a single-snapshot N-D frequency mixture as an N-D array $X$ with
\[
x_{m_1,m_2,\cdots,m_N} = \sum_{f=1}^{F} c_f \prod_{n=1}^{N} e^{j\omega_{f,n}(m_n-1)} + w_{m_1,m_2,\cdots,m_N}, \tag{1}
\]
where $m_n = 1,\ldots,M_n$ for $n = 1,\ldots,N$, and $M_n$ is the sample size of the $n$-th dimension. The total sample size is $M := \prod_{n=1}^{N} M_n$. In (1), the frequencies $\omega_{f,n} \in \mathcal{O}$, for $f = 1,\ldots,F$, $n = 1,\ldots,N$, and $w_{m_1,m_2,\cdots,m_N}$ is white Gaussian noise with variance $\sigma^2$. Similarly, $T$ snapshots of N-D frequency data mixtures may be modeled as $T$ N-D arrays $X(t)$ with
\[
x_{m_1,m_2,\cdots,m_N}(t) = \sum_{f=1}^{F} c_f(t) \prod_{n=1}^{N} e^{j\omega_{f,n}(m_n-1)} + w_{m_1,m_2,\cdots,m_N}(t), \tag{2}
\]
for $t = 1,\cdots,T$, where $t$ is the snapshot index, which can be a time index or a trial index in case of multiple trials of experiments. $T = 1$ corresponds to the single-snapshot case. The frequency set $\{\omega_{f,n}\}_{n=1}^{N}$ is an N-D frequency component, and there are $F$ such components. The objective of N-D frequency estimation is to estimate $\{\omega_{f,n}\}_{n=1}^{N}$, for $f = 1,\ldots,F$, from $X(t)$, $t = 1,\cdots,T$. Notice that the same data model for the case of multiple snapshots has been used in [5], [17]. In order to facilitate the presentation, we introduce the equivalent data models using the notation of the Khatri-Rao product. Given (2), define the sample vector $x(t)$ for the $t$-th
snapshot as
\[
x(t) = \big[x_{1,1,\cdots,1}(t)\ \ x_{1,1,\cdots,2}(t)\ \cdots\ x_{1,1,\cdots,M_N}(t)\ \ x_{1,1,\cdots,2,1}(t)\ \cdots\ x_{M_1,M_2,\cdots,M_N}(t)\big]^T.
\]
Furthermore, define $N$ Vandermonde matrices $A_n \in \mathbb{C}^{M_n \times F}$ with generators $\{e^{j\omega_{f,n}}\}_{f=1}^{F}$ such that
\[
A_n := [a_{1,n}\ a_{2,n}\ \cdots\ a_{F,n}], \quad
a_{f,n} := \big[1\ \ e^{j\omega_{f,n}}\ \cdots\ e^{j(M_n-1)\omega_{f,n}}\big]^T, \quad n = 1,\ldots,N.
\]
It can be verified that
\[
x(t) = Ac(t) + w(t), \quad t = 1,\ldots,T, \tag{3}
\]
where $w(t)$ is the noise vector, and
\[
A := A_1 \odot A_2 \odot \cdots \odot A_N, \quad
c(t) := [c_1(t)\ c_2(t)\ \cdots\ c_F(t)]^T.
\]
Define
\[
X := [x(1)\ x(2)\ \cdots\ x(T)] \in \mathbb{C}^{M\times T}, \quad
C := [c(1)\ c(2)\ \cdots\ c(T)] \in \mathbb{C}^{F\times T};
\]
then the data model in (3) can be rewritten in matrix form as
\[
X = AC + W, \tag{4}
\]
where $W$ is the corresponding noise matrix.

We need the following lemmas later. Lemma 1 is used to prove Lemma 4, which is a key result used in our algorithm. Lemma 3 connects N-D data smoothing to the Khatri-Rao product of $N$ Vandermonde matrices. Lemmas 1-3 are available in the literature, but are reproduced here for convenience.

Lemma 1: Consider an analytic function $h(x)$ of a vector with complex variable elements $x = [x_1,\cdots,x_N]^T \in \mathbb{C}^{N\times 1}$. If $h$ is nontrivial in the sense that there exists $x_0 \in \mathbb{C}^{N\times 1}$ such that $h(x_0) \neq 0$, then the zero set of $h(x)$,
\[
Z := \{x \in \mathbb{C}^{N\times 1} \mid h(x) = 0\},
\]
is of measure (Lebesgue measure in $\mathbb{C}^{N\times 1}$) zero.
Proof: See Lemma 2 of [9].

Lemma 2: Given $N$ Vandermonde matrices $A_n \in \mathbb{C}^{M_n\times F}$ for $n = 1,\ldots,N \geq 2$, the rank of $A = A_1 \odot A_2 \odot \cdots \odot A_N$ is $\min\big(\prod_{n=1}^{N} M_n,\ F\big)$ almost surely, if the $NF$ generators $\{e^{j\omega_{f,n}}\}_{f=1}^{F}$ of $A_n$, $n = 1,\ldots,N$, are drawn from a continuous distribution with respect to the Lebesgue measure in $\mathbb{C}^{NF\times 1}$.
Proof: See Theorem 2 of [9].

Lemma 3: Define a set of N-D selection matrices as
\[
J_{\ell_1,\ell_2,\cdots,\ell_N} := J^{K_1}_{\ell_1} \otimes J^{K_2}_{\ell_2} \otimes \cdots \otimes J^{K_N}_{\ell_N}, \tag{5}
\]
\[
J^{K_n}_{\ell_n} := \big[0_{K_n\times(\ell_n-1)}\ \ I_{K_n}\ \ 0_{K_n\times(L_n-\ell_n)}\big], \tag{6}
\]
where $\ell_n = 1,\ldots,L_n$, and $K_n$ and $L_n$ are positive integers satisfying $K_n + L_n = M_n + 1$ for $n = 1,\ldots,N$. Further define an N-D smoothing operator for the snapshot vector in (3) as
\[
S[x(t)] := \big[J_{1,1,\cdots,1}x(t)\ \ J_{1,1,\cdots,2}x(t)\ \cdots\ J_{1,1,\cdots,L_N}x(t)\ \ J_{1,1,\cdots,2,1}x(t)\ \cdots\ J_{L_1,L_2,\cdots,L_N}x(t)\big]; \tag{7}
\]
then it can be verified that in the absence of noise
\[
X_S(t) := S[x(t)] = \big(A_1^{(K_1)} \odot A_2^{(K_2)} \odot \cdots \odot A_N^{(K_N)}\big) D\big(c(t)\big) \big(A_1^{(L_1)} \odot A_2^{(L_2)} \odot \cdots \odot A_N^{(L_N)}\big)^T. \tag{8}
\]
Proof: See Lemma 2 of [13].

Here we give an example of the smoothing operator in the 2-D case. Suppose that $x$ is a snapshot of 2-D observed data in (3); then the smoothed data matrix $S(x)$ in (7) is equivalent to
\[
S(x) = \begin{bmatrix}
X_1 & X_2 & \cdots & X_{L_1} \\
X_2 & X_3 & \cdots & X_{L_1+1} \\
\vdots & \vdots & \ddots & \vdots \\
X_{K_1} & X_{K_1+1} & \cdots & X_{M_1}
\end{bmatrix}, \quad
X_k = \begin{bmatrix}
x_{k,1} & x_{k,2} & \cdots & x_{k,L_2} \\
x_{k,2} & x_{k,3} & \cdots & x_{k,L_2+1} \\
\vdots & \vdots & \ddots & \vdots \\
x_{k,K_2} & x_{k,K_2+1} & \cdots & x_{k,M_2}
\end{bmatrix}.
\]
Lemma 3 claims that
\[
S(x) = \big(A_1^{(K_1)} \odot A_2^{(K_2)}\big) D(c) \big(A_1^{(L_1)} \odot A_2^{(L_2)}\big)^T.
\]

Lemma 4: Given $N$ Vandermonde matrices $A_n \in \mathbb{C}^{M_n\times F}$ with generators $\{e^{j\omega_{f,n}}\}_{f=1}^{F}$, for $n = 1,\ldots,N$, and a complex matrix $C \in \mathbb{C}^{F\times T}$, define
\[
B := \begin{bmatrix} C^T \\ \Pi_T C^H D(\beta) \end{bmatrix}, \tag{9}
\]
where $\Pi_T$ is a $T \times T$ permutation matrix with ones on its anti-diagonal, and
\[
\beta = \big[e^{-j\beta_1}\ e^{-j\beta_2}\ \cdots\ e^{-j\beta_F}\big]^T, \quad
\beta_f = \sum_{n=1}^{N} (M_n - 1)\omega_{f,n}. \tag{10}
\]
If the $NF$ frequencies $\{\omega_{f,n}\}_{n=1}^{N}$, $f = 1,\cdots,F$, and the $TF$ elements of $C$ are drawn from distributions that are continuous with respect to the Lebesgue measure in $\mathcal{O}^{NF\times 1}$ and $\mathbb{C}^{TF\times 1}$, respectively, then the rank of the matrix
\[
\widetilde{H} := A_1^{(L_1)} \odot A_2^{(L_2)} \odot \cdots \odot A_N^{(L_N)} \odot B \tag{11}
\]
is $\min\big\{2T\prod_{n=1}^{N} L_n,\ F\big\}$ almost surely.
Proof: See Appendix A.
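As a quick numerical illustration of Lemma 3, the sketch below builds a noiseless 2-D snapshot and checks the factorization (8) directly. The function names (`khatri_rao`, `smooth_2d`) and the 0-based indexing are our own conventions, not from the paper.

```python
import numpy as np

def khatri_rao(A, B):
    """Column-wise Kronecker product (the Khatri-Rao product)."""
    return np.einsum('if,jf->ijf', A, B).reshape(-1, A.shape[1])

def smooth_2d(Xmat, K1, K2):
    """2-D smoothing operator S of (7): block-Hankel of Hankel blocks."""
    M1, M2 = Xmat.shape
    L1, L2 = M1 + 1 - K1, M2 + 1 - K2
    S = np.empty((K1 * K2, L1 * L2), dtype=Xmat.dtype)
    for k1 in range(K1):
        for k2 in range(K2):
            for l1 in range(L1):
                for l2 in range(L2):
                    S[k1 * K2 + k2, l1 * L2 + l2] = Xmat[k1 + l1, k2 + l2]
    return S

rng = np.random.default_rng(0)
M1, M2, F, K1, K2 = 6, 5, 2, 4, 3
L1, L2 = M1 + 1 - K1, M2 + 1 - K2
omega = rng.uniform(-np.pi, np.pi, (F, 2))
c = rng.standard_normal(F) + 1j * rng.standard_normal(F)
A1 = np.exp(1j * np.outer(np.arange(M1), omega[:, 0]))  # Vandermonde
A2 = np.exp(1j * np.outer(np.arange(M2), omega[:, 1]))
Xmat = (A1 * c) @ A2.T                    # noiseless 2-D snapshot, per (1)
lhs = smooth_2d(Xmat, K1, K2)
rhs = khatri_rao(A1[:K1], A2[:K2]) @ np.diag(c) @ khatri_rao(A1[:L1], A2[:L2]).T
print(np.allclose(lhs, rhs))              # (8) holds in the noiseless case
```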
III. THE EIGENVECTOR-BASED ALGORITHM FOR N-D FREQUENCY ESTIMATION

In this section we present the algorithm for N-D frequency estimation from multiple snapshots. For simplicity of exposition, the algorithm is developed in the noiseless case. The identifiability of the algorithm is discussed in Section IV, and the noise effect on the performance of the algorithm is analyzed in Section V.

Given (3) in the noiseless case, we can apply the smoothing operator $S$ defined in Lemma 3 to every snapshot $x(t)$ and obtain
\[
X_S(t) := S[x(t)] = GD\big(c(t)\big)H^T, \tag{12}
\]
where
\[
G := A_1^{(K_1)} \odot A_2^{(K_2)} \odot \cdots \odot A_N^{(K_N)}, \quad
H := A_1^{(L_1)} \odot A_2^{(L_2)} \odot \cdots \odot A_N^{(L_N)}.
\]
The positive integers $K_n$ and $L_n$, $n = 1,\cdots,N$, are chosen such that
\[
K_n + L_n = M_n + 1, \quad 1 \leq n \leq N. \tag{13}
\]
To further explore the data structure, we can perform forward-backward smoothing on the data vector $x(t)$ in (3). Define
\[
y(t) := \Pi_M x^*(t), \tag{14}
\]
where $\Pi_M$ is an $M \times M$ backward permutation matrix. It can be verified that $y(t) = A\widetilde{c}(t)$, where $\widetilde{c}(t) = [\widetilde{c}_1(t), \widetilde{c}_2(t), \cdots, \widetilde{c}_F(t)]^T$ with $\widetilde{c}_f(t) = c_f^*(t)e^{-j\beta_f}$, and $\beta_f$ is defined in (10). Applying the same technique to $y(t)$ that we used to construct $X_S(t)$ from $x(t)$, we obtain $Y_S(t) := S[y(t)] = GD\big(\widetilde{c}(t)\big)H^T$. We then collect all the smoothed data matrices to obtain
\[
\widetilde{X} := \big[X_S(1)\ X_S(2)\ \cdots\ X_S(T)\ \ Y_S(T)\ Y_S(T-1)\ \cdots\ Y_S(1)\big]. \tag{15}
\]
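The forward-backward stacking in (14)-(15) can be sketched numerically; for brevity, the sketch below uses the 1-D special case ($N = 1$) with our own variable names, and checks that the stacked matrix has rank $F$, as Lemma 4 predicts.

```python
import numpy as np

def smooth_1d(x, K):
    """Hankel (smoothed) matrix with K rows from a length-M vector, L = M+1-K."""
    L = len(x) + 1 - K
    return np.array([[x[k + l] for l in range(L)] for k in range(K)])

rng = np.random.default_rng(0)
M, T, F, K = 8, 2, 3, 5
omega = rng.uniform(-np.pi, np.pi, F)
A = np.exp(1j * np.outer(np.arange(M), omega))            # Vandermonde
C = rng.standard_normal((F, T)) + 1j * rng.standard_normal((F, T))
X = A @ C                                                 # noiseless snapshots, (4)
fwd = [smooth_1d(X[:, t], K) for t in range(T)]           # X_S(t), (12)
bwd = [smooth_1d(np.conj(X[::-1, t]), K)                  # Y_S(t) from y(t), (14)
       for t in range(T - 1, -1, -1)]
Xtil = np.hstack(fwd + bwd)                               # (15), size K x 2TL
print(np.linalg.matrix_rank(Xtil))                        # F = 3 here
```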
Define $K := \prod_{n=1}^{N} K_n$ and $L := \prod_{n=1}^{N} L_n$. The size of $\widetilde{X}$ is $K \times 2TL$. It can be verified that
\[
\widetilde{X} = G(H \odot B)^T = G\widetilde{H}^T, \tag{16}
\]
where $B$ and $\widetilde{H}$ are defined in (9) and (11), respectively. A key step of our algorithm is the construction of $\widetilde{X}$ to ensure that it is of rank $F$ almost surely. In (16), since $G$ is the Khatri-Rao product of multiple Vandermonde matrices, invoking Lemma 2, $G$ has full column rank almost surely if $\prod_{n=1}^{N} K_n \geq F$. According to Lemma 4, if $2T\prod_{n=1}^{N} L_n \geq F$, $\widetilde{H}$ has full column rank almost surely. Based on Sylvester's inequality,
\[
\mathrm{rank}(G) + \mathrm{rank}\big(\widetilde{H}^T\big) - F \leq \mathrm{rank}\big(G\widetilde{H}^T\big) \leq \min\big\{\mathrm{rank}(G),\ \mathrm{rank}\big(\widetilde{H}^T\big)\big\},
\]
$\widetilde{X}$ is of rank $F$ almost surely. The singular value decomposition (SVD) of $\widetilde{X}$ yields
\[
\widetilde{X} = U_s\Sigma_s V_s^H, \tag{17}
\]
where $U_s$ has $F$ columns that together span the column space of $\widetilde{X}$. Since the same space is spanned by the columns of $G$, there exists an $F \times F$ nonsingular matrix $T^{-1}$ such that
\[
U_s = GT^{-1}. \tag{18}
\]
Similar to the IMDF algorithm [13], once the signal subspace $U_s$ is obtained, we can construct two matrices from $U_s$ whose generalized eigenvalues are the exponentials of the first dimension, and then use the generalized eigenvectors to estimate the N-D frequencies. However, as mentioned before, the IMDF algorithm fails when there exist identical frequencies in the first dimension, since the eigenvectors are no longer linearly independent. Furthermore, it has been shown in [13] that the performance of the IMDF algorithm degrades if there are close frequencies in the first dimension. To address this problem, in the following we present a method to construct two matrices whose generalized eigenvalues are weighted sums of the N-D exponentials. The N-D frequencies are still resolved from the generalized eigenvectors.

We define two selection matrices $J_1$ and $J_2$ as
\[
J_1 := J_{1,1} \otimes J_{1,2} \otimes \cdots \otimes J_{1,N}, \quad
J_2 := \sum_{n=1}^{N} \alpha_n \widetilde{J}_{2,n}, \tag{19}
\]
where
\[
J_{1,n} = \big[I_{K_n-1}\ \ 0_{(K_n-1)\times 1}\big], \quad
J_{2,n} = \big[0_{(K_n-1)\times 1}\ \ I_{K_n-1}\big],
\]
\[
\widetilde{J}_{2,n} = (J_{1,1} \otimes \cdots \otimes J_{1,n-1}) \otimes J_{2,n} \otimes (J_{1,n+1} \otimes \cdots \otimes J_{1,N}).
\]
Here $\{\alpha_n\}_{n=1}^{N}$ are complex weighting factors, which can be randomly chosen initially. As we will show in Section V, the MSEs of the frequency estimates are affected by these weighting factors in the noisy case. Section VI discusses how to optimize these weighting factors to minimize the MSEs. Next, we obtain two equal-sized matrices $U_1$ and $U_2$ by
\[
U_1 := J_1 U_s, \quad U_2 := J_2 U_s. \tag{20}
\]
According to the property of the Khatri-Rao product [1], $(A \otimes B)(C \odot D) = AC \odot BD$, it can be verified that
\[
U_1 = PT^{-1}, \quad U_2 = PD(\zeta)T^{-1}, \tag{21}
\]
where $P = A_1^{(K_1-1)} \odot A_2^{(K_2-1)} \odot \cdots \odot A_N^{(K_N-1)}$, and
\[
\zeta := [\zeta_1, \zeta_2, \cdots, \zeta_F]^T, \quad \zeta_f = \sum_{n=1}^{N} \alpha_n e^{j\omega_{f,n}}.
\]
It is clear that $P$ has full column rank almost surely if $F \leq \prod_{n=1}^{N}(K_n - 1)$. We can resolve $T$ from the matrix pencil $U_1$ and $U_2$ via least-squares or total-least-squares approaches [20]. Here, we use the least-squares approach: $T$ is retrieved from the following EVD up to column permutation and scaling ambiguity,
\[
U_1^\dagger U_2 = TD(\zeta)T^{-1}. \tag{22}
\]
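The pencil construction of (19)-(22) can be illustrated end to end. The sketch below (our code; $\alpha$ is chosen arbitrarily here, not optimized as in Section VI) uses a 2-D, single-snapshot, noiseless example in which the two components share the same first-dimension frequency, so a pencil along dimension 1 alone would have a repeated eigenvalue; with complex weights the eigenvalues $\zeta_f$ are distinct, and the paired frequencies are recovered from the eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(1)
M1 = M2 = 6; K1 = K2 = 4
L1, L2 = M1 + 1 - K1, M2 + 1 - K2
F = 2
omega = np.array([[0.5, 1.0], [0.5, -0.7]])    # identical frequency in dim 1
c = rng.standard_normal(F) + 1j * rng.standard_normal(F)

A1 = np.exp(1j * np.outer(np.arange(M1), omega[:, 0]))
A2 = np.exp(1j * np.outer(np.arange(M2), omega[:, 1]))
Xmat = (A1 * c) @ A2.T                          # noiseless 2-D snapshot

# forward smoothing (7): row (k1,k2), column (l1,l2) -> x[k1+l1, k2+l2]
XS = np.array([[Xmat[k1 + l1, k2 + l2] for l1 in range(L1) for l2 in range(L2)]
               for k1 in range(K1) for k2 in range(K2)])

Us = np.linalg.svd(XS)[0][:, :F]                # signal subspace, (17)
J11, J21 = np.eye(K1)[:-1], np.eye(K1)[1:]      # [I 0] and [0 I] in dim 1
J12, J22 = np.eye(K2)[:-1], np.eye(K2)[1:]      # same in dim 2
alpha = np.array([1.0, 1.0j])                   # complex weights, (19)
J1 = np.kron(J11, J12)
J2 = alpha[0] * np.kron(J21, J12) + alpha[1] * np.kron(J11, J22)
U1, U2 = J1 @ Us, J2 @ Us                       # (20)
zeta, Tm = np.linalg.eig(np.linalg.pinv(U1) @ U2)   # (22): zeta_f = a1 e^{jw1} + a2 e^{jw2}
P = U1 @ Tm                                     # (23), up to column scaling/permutation
w2 = np.angle(P[1, :] / P[0, :])                # adjacent rows within a k1-block
w1 = np.angle(P[K2 - 1, :] / P[0, :])           # rows one k1-block apart
print(sorted(zip(w1.round(6), w2.round(6))))    # automatically paired (w1, w2)
```

Setting `alpha = np.array([1.0, 0.0])` reproduces the failure mode of a dimension-1 pencil: both eigenvalues collapse to $e^{j0.5}$.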
Clearly we can choose $\{\alpha_n\}_{n=1}^{N}$ to ensure that the elements of $\zeta$ are distinct even if there exist identical frequencies in one or more dimensions, but this is not guaranteed by randomly generated $\{\alpha_n\}_{n=1}^{N}$. We will discuss how to choose the weighting factors in Section VI. Suppose that the EVD of $U_1^\dagger U_2$ gives $T_{sp} = T\Lambda\Delta$, where $\Lambda$ is a nonsingular diagonal column scaling matrix and $\Delta$ is a permutation matrix. Once we obtain $T_{sp}$, we can retrieve $P$ up to column permutation and scaling ambiguity according to
\[
P_{sp} = U_1 T_{sp} = P\Lambda\Delta. \tag{23}
\]
Notice that $P$ is the Khatri-Rao product of $N$ Vandermonde matrices, and there are $F$ columns in $P$. The $N$ frequencies of the same N-D component appear in the same column of $P$; in other words, for a fixed $f$, $\{\omega_{f,n}\}_{n=1}^{N}$ appear in the same column of $P$. Thanks to this structure, we can obtain the $F$ N-D frequency components by dividing suitably chosen elements of the aforementioned columns of $P_{sp}$. Therefore the column scaling and permutation will not have a material effect on the algorithm, and for this reason we drop the subscript "$sp$" from now on as long as it is clear from the context. Suppose that $\{e^{j\omega_{f,n}}\}_{n=1}^{N}$ appear in the $f$-th column of $P$. Then $e^{j\omega_{f,n}}$ can be obtained by any one of the following quotients:
\[
e^{j\omega_{f,n}} = \frac{p_{k,f}}{p_{k-K'_n,f}}, \quad \mathrm{mod}(k-1, K'_{n-1}) \geq K'_n, \tag{24}
\]
for $f = 1,\ldots,F$, where $1 \leq k \leq K'_0$, $p_{k,f}$ is the $(k,f)$-th element of $P$, and
\[
K'_n := \begin{cases} \prod_{p=n+1}^{N} (K_p - 1), & 0 \leq n \leq N-1, \\ 1, & n = N. \end{cases}
\]
Notice that the frequencies are automatically paired, because the frequencies $\omega_{f,n}$, $n = 1,\ldots,N$, of the same N-D component (the $f$-th component) are obtained from the same column of $P$.

If the data observations are noisy as given in (3), applying the above algorithm we can obtain the estimate of $P$ as $\widehat{P}$. In order to reduce the MSEs of the frequency estimates, we use the average of all the quotients in (24) to obtain an estimate of the N-D exponential. In other words, $e^{j\omega_{f,n}}$ is estimated by the following average:
\[
\widehat{e^{j\omega_{f,n}}} = \frac{1}{\mu_n} \sum_{\substack{k=1 \\ \mathrm{mod}(k-1, K'_{n-1}) \geq K'_n}}^{K'_0} \frac{\hat{p}_{k,f}}{\hat{p}_{k-K'_n,f}}, \quad n = 1,\ldots,N, \tag{25}
\]
where $\mu_n = K'_0(K_n - 2)/(K_n - 1)$. This average is also the so-called "circular mean" in directional statistics [16]. Finally, the frequency estimates are obtained by
\[
\hat{\omega}_{f,n} = \Im\big(\ln \widehat{e^{j\omega_{f,n}}}\big). \tag{26}
\]
After the frequency estimates are obtained, the amplitude matrix $C$ can be obtained by solving (4) using a least-squares approach.
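The averaging in (24)-(26) amounts to a circular mean of one-step quotients along each dimension of a column of $\widehat{P}$ reshaped to a $(K_1-1)\times\cdots\times(K_N-1)$ grid. A minimal sketch, with our own function name:

```python
import numpy as np

def freqs_from_column(p_col, Kdims):
    """Estimate the N paired frequencies from one column of P-hat.

    p_col: vector of length prod(Kdims); Kdims = (K1-1, ..., KN-1).
    Averages the one-step quotients (24) along each dimension (the
    circular mean of (25)) and takes the angle, as in (26).
    """
    p = np.asarray(p_col).reshape(Kdims)
    omega = []
    for n in range(p.ndim):
        q = np.moveaxis(p, n, 0)
        omega.append(np.angle(np.mean(q[1:] / q[:-1])))
    return omega
```

Because all $N$ quotient sets come from the same column, the returned estimates are automatically paired.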
IV. STATISTICAL IDENTIFIABILITY

The identifiability bound provides a sufficient identification condition for the N-D frequency estimation problem. The higher the identifiability bound an algorithm can achieve, the more relaxed the sufficient condition is. We will show that our proposed algorithm offers a higher identifiability bound than existing ones. We summarize our main result on statistical identifiability for the proposed algorithm in the following theorem.

Theorem 1: Given $T$ snapshots of sums of $F$ N-D exponentials as in (2), in the absence of noise, the parameter set $(\{\omega_{f,n}\}_{n=1}^{N}, \{c_f(t)\}_{t=1}^{T})$, $f = 1,\ldots,F$, is almost surely uniquely identifiable by the weighted IMDF algorithm given in Section III, provided that
\[
F \leq \max_{\substack{K_n + L_n = M_n + 1 \\ 1 \leq K_n \leq M_n \\ 1 \leq n \leq N}} \min\left(\prod_{n=1}^{N}(K_n - 1),\ 2T\prod_{n=1}^{N} L_n\right), \tag{27}
\]
where the $NF$ frequencies $\{\omega_{f,n}\}_{n=1}^{N}$, $f = 1,\ldots,F$, and the $TF$ complex amplitudes $\{c_f(t)\}_{t=1}^{T}$, $f = 1,\ldots,F$, are assumed to be drawn from distributions that are continuous with respect to the Lebesgue measure in $\mathcal{O}^{NF\times 1}$ and $\mathbb{C}^{TF\times 1}$, respectively.

Proof: In Section III we have shown that $F$ N-D frequency components can be uniquely recovered almost surely, provided that
\[
\prod_{n=1}^{N} K_n \geq F, \quad 2T\prod_{n=1}^{N} L_n \geq F, \quad \prod_{n=1}^{N}(K_n - 1) \geq F,
\]
where $K_n$ and $L_n$, $n = 1,\cdots,N$, are positive integers subject to the constraint that $K_n + L_n = M_n + 1$, $1 \leq K_n \leq M_n$, $1 \leq n \leq N$. Since $\prod_{n=1}^{N}(K_n - 1) \geq F$ implies $\prod_{n=1}^{N} K_n \geq F$, if (27) holds, $F$ N-D frequency components are uniquely identifiable almost surely from (2). This concludes the proof.

It is difficult to find a closed-form solution for the right-hand side (RHS) of (27), but we can find its upper bound, which is given in Lemma 5. Defining the function
\[
\Gamma_T(M_1,\cdots,M_N) := \max_{\substack{K_n + L_n = M_n + 1 \\ 1 \leq K_n \leq M_n \\ 1 \leq n \leq N}} \min\left(\prod_{n=1}^{N} K_n,\ 2T\prod_{n=1}^{N} L_n\right), \tag{28}
\]
the RHS of (27) is $\Gamma_T(M_1 - 1, \cdots, M_N - 1)$.

Lemma 5: If $T \leq \prod_{n=1}^{N}(M_n - 1)/2$, then
\[
\Gamma_T(M_1 - 1, M_2 - 1, \cdots, M_N - 1) \leq \left\lfloor \frac{2TM}{(1 + \sqrt[N]{2T})^N} \right\rfloor,
\]
where $M$ is defined in (1); and if $T \geq \prod_{n=1}^{N}(M_n - 1)/2$, then
\[
\Gamma_T(M_1 - 1, M_2 - 1, \cdots, M_N - 1) = \prod_{n=1}^{N}(M_n - 1).
\]
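Before proceeding to the proof, note that $\Gamma_T$ in (28) is easy to evaluate by brute force for small $N$; the sketch below (our code, not the authors') enumerates the splits $K_n + L_n = M_n + 1$ and reproduces both the staircase growth in $T$ and the saturation at $\prod_n (M_n - 1)$ asserted by the lemma.

```python
import numpy as np
from itertools import product

def gamma_T(Ms, T):
    """Gamma_T(M1,...,MN) of (28): max over K_n + L_n = M_n + 1 of
    min(prod K_n, 2T prod L_n), by exhaustive search."""
    best = 0
    for Ks in product(*(range(1, M + 1) for M in Ms)):
        Ls = [M + 1 - K for M, K in zip(Ms, Ks)]
        best = max(best, min(int(np.prod(Ks)), 2 * T * int(np.prod(Ls))))
    return best

# identifiability bound of (27) for an M1 x M2 grid: Gamma_T(M1-1, M2-1)
M1 = M2 = 8
for T in (1, 2, 8, 32):
    print(T, gamma_T([M1 - 1, M2 - 1], T))   # grows with T, saturates at 49
```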
Proof: Define $\rho_n = L_n/M_n$, $n = 1,\ldots,N$, and $\rho = \sum_{n=1}^{N} \rho_n / N$. It is clear that $\rho \in (0,1]$. If $K_n$, $n = 1,\ldots,N$, are all real numbers,
\[
\min\left(\prod_{n=1}^{N}(K_n - 1),\ 2T\prod_{n=1}^{N} L_n\right)
= \prod_{n=1}^{N} M_n \cdot \min\left(\prod_{n=1}^{N}(1-\rho_n),\ 2T\prod_{n=1}^{N}\rho_n\right)
\leq M \min\left\{(1-\rho)^N,\ 2T\rho^N\right\}
\]
\[
= \begin{cases} 2T\rho^N M, & 0 < \rho \leq \frac{1}{1+\sqrt[N]{2T}}, \\[2pt] (1-\rho)^N M, & \frac{1}{1+\sqrt[N]{2T}} \leq \rho \leq 1, \end{cases}
\qquad \leq \frac{2TM}{(1+\sqrt[N]{2T})^N}.
\]
The two equalities hold when $L_n = \frac{M_n}{1+\sqrt[N]{2T}}$, $n = 1,\ldots,N$. Since $L_n$, $n = 1,\ldots,N$, and $\Gamma_T(M_1,\cdots,M_N)$ are all integers, the exact value of $\frac{2TM}{(1+\sqrt[N]{2T})^N}$ may not be achieved, and thus
\[
\Gamma_T(M_1 - 1, \cdots, M_N - 1) \leq \left\lfloor \frac{2TM}{(1+\sqrt[N]{2T})^N} \right\rfloor.
\]
When $T \geq \prod_{n=1}^{N}(M_n - 1)/2$, since $L_n \geq 1$ and $M_n \geq K_n$ for $n = 1,\ldots,N$, we have
\[
2T\prod_{n=1}^{N} L_n \geq \prod_{n=1}^{N}(M_n - 1)\prod_{n=1}^{N} L_n \geq \prod_{n=1}^{N}(M_n - 1) \geq \prod_{n=1}^{N}(K_n - 1).
\]
Therefore $\Gamma_T(M_1 - 1, \cdots, M_N - 1) = \prod_{n=1}^{N}(M_n - 1)$ when $T \geq \prod_{n=1}^{N}(M_n - 1)/2$.

It can be seen from Lemma 5 that when the number of snapshots $T = 1$, the identifiability bound is approximately $\frac{2M}{(1+\sqrt[N]{2})^N}$ (which is approximately $0.34M_1M_2$ in the 2-D case); when $1 \leq T \leq \prod_{n=1}^{N}(M_n - 1)/2$, the identifiability bound is approximately $\frac{2TM}{(1+\sqrt[N]{2T})^N}$, which increases as $T$ increases; and when $T \geq \prod_{n=1}^{N}(M_n - 1)/2$, the identifiability bound becomes a constant, $\prod_{n=1}^{N}(M_n - 1)$ (approximately the data sample size of one snapshot), and thus no data smoothing is necessary in this case from the perspective of maximizing the identifiability bound, since $L_n = 1$, $K_n = M_n$, for $n = 1,\ldots,N$.

Remark 1: In some applications, if it is known a priori that the frequencies in some dimension, say the $i$-th dimension, are distinct, then the statistical identifiability bound can be further relaxed. In this case, it is not necessary to use random weighting factors in (19) to ensure that the eigenvalues of the matrix pencil are distinct. In fact, we can construct a matrix pencil along the $i$-th dimension; the selection matrices in (19) can be replaced by
\[
J_1 := I_{K_1} \otimes \cdots \otimes I_{K_{i-1}} \otimes J_{1,i} \otimes I_{K_{i+1}} \otimes \cdots \otimes I_{K_N},
\]
\[
J_2 := I_{K_1} \otimes \cdots \otimes I_{K_{i-1}} \otimes J_{2,i} \otimes I_{K_{i+1}} \otimes \cdots \otimes I_{K_N}.
\]
All the remaining steps are the same as those given in Section III. In this case, the identifiability condition (27) becomes
\[
F \leq \Gamma_T(M_1, \cdots, M_i - 1, \cdots, M_N). \tag{29}
\]
It can be shown that
\[
\Gamma_T(M_1, \cdots, M_i - 1, \cdots, M_N) > \Gamma_T(M_1 - 1, \cdots, M_i - 1, \cdots, M_N - 1),
\]
and thus the identifiability bound in (29) is higher than that in (27). When $T = 1$, the identifiability bound in (29) is that of the IMDF algorithm in [13]. If the frequencies in all $N$ dimensions are distinct, we can construct the matrix pencil $U_1$ and $U_2$ along any dimension. However, the following lemma shows that it is advantageous to construct the matrix pencil along the dimension with the largest dimension size.

Lemma 6: If $M_i \geq M_j$, then
\[
\Gamma_T(M_1, \cdots, M_i - 1, \cdots, M_N) \geq \Gamma_T(M_1, \cdots, M_j - 1, \cdots, M_N).
\]
Proof: The proof is similar to that of Proposition 2 in [13].

Remark 2: We note that a similar data smoothing approach is also used in the Unitary ESPRIT algorithm [5]. However, it was not proved there that the smoothed data matrix is of rank $F$, nor was it pointed out in [5] that the N-D preprocessing step can be applied when $T > 1$. Although the identifiability bound was considered in a special 3-D frequency estimation case, when $M_1$ and $M_2$ are fairly small and smoothing is only performed in the third dimension, the identifiability bound in the general case is not given in [5]. Actually, if N-D data smoothing is applied in the multiple-snapshot case, it can be shown that the identifiability bound of the Unitary ESPRIT algorithm is $\Gamma_T(M_1, \cdots, M_n - 1, \cdots, M_N)$, where $M_n = \min_{1\leq i\leq N} M_i$. From Lemma 6, we conclude that the identifiability bound of the Unitary ESPRIT algorithm is less than or equal to that in (29), but greater than that in (27). To our knowledge, (29) gives the most relaxed identifiability bound to date for N-D frequency estimation from finite snapshots.

As a special case, consider $N = 2$. In this case, the identifiability condition (27) becomes
\[
F \leq \max_{\substack{K_1 + L_1 = M_1 + 1 \\ K_2 + L_2 = M_2 + 1}} \min\left\{(K_1 - 1)(K_2 - 1),\ 2TL_1L_2\right\}. \tag{30}
\]
The identifiability condition (29) becomes (assuming $i = 1$)
\[
F \leq \max_{\substack{K_1 + L_1 = M_1 + 1 \\ K_2 + L_2 = M_2 + 1}} \min\left\{(K_1 - 1)K_2,\ 2TL_1L_2\right\}. \tag{31}
\]
The identifiability condition of the Unitary ESPRIT algorithm [5] is
\[
F \leq \max_{\substack{K_1 + L_1 = M_1 + 1 \\ K_2 + L_2 = M_2 + 1}} \min\left\{(K_1 - 1)K_2,\ K_1(K_2 - 1),\ 2TL_1L_2\right\}. \tag{32}
\]
In Fig. 1(a), we plot the identifiability bound in (30) versus dimension size with $M_1 = M_2$. It can be seen that the identifiability bound generally increases with the snapshot size and the number of snapshots. In Fig. 1(b), we plot the identifiability bounds in (30), (31), and (32) versus the number of snapshots. It can be seen that when the number of snapshots $T$ is less than a threshold (approximately $M/2$), the bounds increase like stairs as $T$ increases. When $T$
is greater than the threshold, the bounds become approximately $M$. For example, for the identifiability condition given in (31), when $T \geq (M_1 - 1)M_2/2$, the bound becomes $(M_1 - 1)M_2$.

V. PERFORMANCE ANALYSIS

In this section, we use perturbation analysis to derive the theoretical MSEs of the proposed algorithm in the noisy case. We need the following two assumptions for our analysis.

1) In the noiseless case, the nonzero singular values of $\widetilde{X}$ in (17) are distinct, and the eigenvalues of $U_1^\dagger U_2$ in (22) are distinct.

2) The eigenvectors in $T$ of (22) are normalized, and the permutation matrix $\Delta = I$, since permutation does not affect the MSEs.

Suppose that the perturbed data matrix is $\widehat{\widetilde{X}} = \widetilde{X} + \Delta\widetilde{X}$, where $\Delta\widetilde{X}$ is the perturbation term, which is related to the noise term $W$ in (4) such that (cf. (15))
\[
\Delta\widetilde{X} = \big[S[w(1)]\ \cdots\ S[w(T)]\ \ S[\Pi_M w^*(T)]\ \cdots\ S[\Pi_M w^*(1)]\big]. \tag{33}
\]
Let the SVD of $\widetilde{X}$ in the noiseless case be
\[
\widetilde{X} = U_s\Sigma_s V_s^H + U_n\Sigma_n V_n^H, \tag{34}
\]
where the vectors in $U_s$, associated with the $F$ non-zero singular values, span the signal subspace, while the vectors in $U_n$, associated with the zero singular values, span the orthogonal subspace of $U_s$. It is clear that $\Sigma_n = 0$. The SVD of the perturbed data matrix $\widehat{\widetilde{X}} = \widetilde{X} + \Delta\widetilde{X}$ is given by
\[
\widehat{\widetilde{X}} = \widehat{U}_s\widehat{\Sigma}_s\widehat{V}_s^H + \widehat{U}_n\widehat{\Sigma}_n\widehat{V}_n^H. \tag{35}
\]
The perturbed signal subspace is $\widehat{U}_s = U_s + \Delta U_s$. In Appendix B, we give a first-order approximation of $\Delta U_s$ as a function of $w(t)$, $t = 1,\ldots,T$. Suppose that the frequencies estimated from $\widehat{\widetilde{X}}$ are $\hat{\omega}_{f,n} = \omega_{f,n} + \Delta\omega_{f,n}$, for $f = 1,\ldots,F$ and $n = 1,\ldots,N$. In the following, we derive the theoretical MSEs of $\hat{\omega}_{f,n}$ based on the result of Appendix B.

Given $\Delta U_s$, the perturbations of $U_1$ and $U_2$ in (20) are
\[
\Delta U_1 = J_1\Delta U_s, \quad \Delta U_2 = J_2\Delta U_s. \tag{36}
\]
According to (21), we have
\[
U_1TD(\zeta) = U_2T. \tag{37}
\]
Let $T = [t_1\ t_2\ \cdots\ t_F]$; then $U_1t_f\zeta_f = U_2t_f$. Differentiating (37), we obtain
\[
(\Delta U_1\zeta_f - \Delta U_2)t_f + U_1t_f\Delta\zeta_f = -(U_1\zeta_f - U_2)\Delta t_f. \tag{38}
\]
Since $t_f$ is the $f$-th normalized eigenvector of $U_1^\dagger U_2$, according to [24, Chapter 2], we can write $\Delta t_f$ as a linear combination of the eigenvectors $t_g$, $1 \leq g \leq F$, $g \neq f$, such that $\Delta t_f = \sum_{g=1, g\neq f}^{F} k_{gf}t_g$. We substitute this into (38) and obtain
\[
(\Delta U_1\zeta_f - \Delta U_2)t_f + U_1t_f\Delta\zeta_f = -(U_1\zeta_f - U_2)\sum_{g=1, g\neq f}^{F} k_{gf}t_g. \tag{39}
\]
From (21), we have $P_{sp}^\dagger U_2 = D(\zeta) P_{sp}^\dagger U_1$. Suppose that $P_{sp}^\dagger = [s_1\ s_2\ \cdots\ s_F]^T$; then $s_f^T (U_1 \zeta_f - U_2) = 0^T$. Also from (21), it can be verified that $s_g^T U_1 t_{g'} = \delta_{g,g'}$ and $s_g^T U_2 t_{g'} = \zeta_g \delta_{g,g'}$, where $\delta_{g,g'}$ is the Kronecker delta ($\delta_{g,g'} = 1$ when $g = g'$; $\delta_{g,g'} = 0$ when $g \ne g'$). If we multiply both sides of (39) by $s_f^T$, we obtain
$$\Delta\zeta_f = -s_f^T (\Delta U_1 \zeta_f - \Delta U_2)\, t_f. \tag{40}$$
We note that the above analysis is similar to the perturbation analysis of eigenvalues in ESPRIT-type algorithms [6], [10], [12], [18], where the expression of $\Delta\zeta_f$ in (40) was also obtained. Here, however, we are interested in the perturbation of the eigenvectors. Multiplying both sides of (39) by $s_g^T$ ($g \ne f$), we obtain
$$k_{gf} = \frac{-s_g^T (\Delta U_1 \zeta_f - \Delta U_2)\, t_f}{\zeta_f - \zeta_g}, \quad 1 \le g \le F,\ g \ne f.$$
Therefore we have $\Delta t_f = R_f \Delta U_s t_f$, where
$$R_f := \sum_{g=1, g\ne f}^F \frac{-t_g s_g^T (J_1 \zeta_f - J_2)}{\zeta_f - \zeta_g} = -T D_f P_{sp}^\dagger (J_1 \zeta_f - J_2). \tag{41}$$
Here $D_f$ is a diagonal matrix with $[D_f]_{g,g} = \frac{1}{\zeta_f - \zeta_g}$ for $g \ne f$ and $[D_f]_{f,f} = 0$. Since the $f$-th column of $P$ is obtained using $p_f = U_1 t_f$, we have
$$\Delta p_f = \Delta U_1 t_f + U_1 \Delta t_f = (J_1 + U_1 R_f)\, \Delta U_s\, t_f. \tag{42}$$
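As a sanity check (our own illustration, not part of the paper's derivation), the first-order formula (40) can be verified numerically with generic stand-ins for $U_1$, $U_2$, $T$, and $P_{sp}^\dagger$: we synthesize $U_2 = U_1 T D(\zeta) T^{-1}$ so that the pencil $U_1^\dagger U_2$ has prescribed eigenvalues $\zeta_f$, apply small random perturbations to $U_1$ and $U_2$, and compare the exact eigenvalue shifts with the prediction of (40). All dimensions, seeds, and magnitudes below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
K, F, eps = 8, 3, 1e-6

# Generic stand-ins: U1 tall and full rank, T unitary (well conditioned),
# zeta distinct eigenvalues on the unit circle
U1 = rng.standard_normal((K, F)) + 1j * rng.standard_normal((K, F))
T, _ = np.linalg.qr(rng.standard_normal((F, F)) + 1j * rng.standard_normal((F, F)))
zeta = np.exp(1j * np.array([0.3, 1.1, 2.5]))
U2 = U1 @ T @ np.diag(zeta) @ np.linalg.inv(T)  # pinv(U1) @ U2 has eigenpairs (zeta_f, t_f)

# Rows s_f^T of pinv(P), with P = U1 T, satisfy s_f^T U1 t_g = delta_{fg}
S = np.linalg.pinv(U1 @ T)

# Small random perturbations of U1 and U2
dU1 = eps * (rng.standard_normal((K, F)) + 1j * rng.standard_normal((K, F)))
dU2 = eps * (rng.standard_normal((K, F)) + 1j * rng.standard_normal((K, F)))
zeta_hat = np.linalg.eigvals(np.linalg.pinv(U1 + dU1) @ (U2 + dU2))

rel_errs = []
for f in range(F):
    predicted = -S[f] @ (dU1 * zeta[f] - dU2) @ T[:, f]                 # formula (40)
    actual = zeta_hat[np.argmin(np.abs(zeta_hat - zeta[f]))] - zeta[f]  # exact shift
    rel_errs.append(abs(predicted - actual) / abs(actual))

assert max(rel_errs) < 1e-2  # agreement up to second-order terms
```

The residual is dominated by second-order terms, so the relative error shrinks linearly with the perturbation size $\epsilon$.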
Now we are ready to obtain the perturbation of $\omega_{f,n}$. Differentiating (26), we have
$$\Delta\omega_{f,n} = \Im\left\{\frac{\Delta e^{j\omega_{f,n}}}{e^{j\omega_{f,n}}}\right\}, \tag{43}$$
where $e^{j\omega_{f,n}}$ is obtained using (25). Differentiating (25) yields $\Delta e^{j\omega_{f,n}}$, and we obtain
$$\frac{\Delta e^{j\omega_{f,n}}}{e^{j\omega_{f,n}}} = \frac{1}{\mu_n}\, \tau_{f,n}^T\, \Delta p_f, \tag{44}$$
where $\tau_{f,n}^T$ is defined as
$$\tau_{f,n}^T = p_{1,f}^{-1}\, \phi_{f,1} \otimes \cdots \otimes \phi_{f,n-1} \otimes \varphi_{f,n} \otimes \phi_{f,n+1} \otimes \cdots \otimes \phi_{f,N} : K_0' \times 1,$$
$$\phi_{f,n} = \big[1\ e^{-j\omega_{f,n}}\ \cdots\ e^{-j(K_n-2)\omega_{f,n}}\big]^T : (K_n-1) \times 1,$$
$$\varphi_{f,n} = \big[-1\ 0\ \cdots\ 0\ e^{-j(K_n-2)\omega_{f,n}}\big]^T : (K_n-1) \times 1.$$
Here $p_{1,f}$ is the $(1,f)$-th element of $P_{sp}$, which is also the $f$-th diagonal element of $\Lambda$. Notice that the nonzero elements of $\tau_{f,n}^T$ are the reciprocals of the corresponding $\{p_{k,f}\}_{k=1}^{K_0'}$ up to a scaling ambiguity, and the zero elements are distributed regularly. Substituting (71) into (42) and (42) into (44), we have
$$\frac{\Delta e^{j\omega_{f,n}}}{e^{j\omega_{f,n}}} = \frac{1}{\mu_n}\sum_{t=1}^T\left[\xi_{f,n,t}^T\, w(t) + \tilde{\xi}_{f,n,t}^T\, w^*(t)\right], \tag{45}$$
where $t_f^T := [t_{f,1}\ t_{f,2}\ \cdots\ t_{f,F}]$,
$$\xi_{f,n,t}^T = \tau_{f,n}^T (J_1 + U_1 R_f) \sum_{g=1}^F t_{f,g}\, \Phi_{g,t},$$
$$\tilde{\xi}_{f,n,t}^T = \tau_{f,n}^T (J_1 + U_1 R_f) \sum_{g=1}^F t_{f,g}\, \tilde{\Phi}_{g,t},$$
and $\Phi_{g,t}$ and $\tilde{\Phi}_{g,t}$ are defined in (71). Since $E[w(t)] = 0$, it is readily seen that
$$E[\Delta\omega_{f,n}] = 0. \tag{46}$$
This means that the estimation is unbiased when only the first-order perturbation is considered. Using the following properties of complex Gaussian noise,
$$E\big[\Re(w(t))\, \Re(w^T(s))\big] = \frac{\sigma^2}{2} I_{M_1 M_2}\, \delta_{t,s},$$
$$E\big[\Im(w(t))\, \Im(w^T(s))\big] = \frac{\sigma^2}{2} I_{M_1 M_2}\, \delta_{t,s},$$
$$E\big[\Re(w(t))\, \Im(w^T(s))\big] = E\big[\Im(w(t))\, \Re(w^T(s))\big] = 0,$$
we can obtain the theoretical MSEs of the frequency estimates as
$$E\big[(\Delta\omega_{f,n})^2\big] = \frac{\sigma^2}{2\mu_n^2}\sum_{t=1}^T\Big[\big\|\Re\big(\xi_{f,n,t}^T - \tilde{\xi}_{f,n,t}^T\big)\big\|^2 + \big\|\Im\big(\xi_{f,n,t}^T + \tilde{\xi}_{f,n,t}^T\big)\big\|^2\Big], \tag{47}$$
for $n = 1, \ldots, N$ and $f = 1, \ldots, F$.
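The complex Gaussian properties used above are easy to confirm by simulation. The sketch below (our illustration; $\sigma$, the dimension $M$, and the sample count are arbitrary) draws circularly symmetric complex Gaussian vectors and estimates the three covariance matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, M, n = 2.0, 4, 200_000

# Circularly symmetric complex Gaussian w(t) with E[w w^H] = sigma^2 I:
# the total variance sigma^2 is split equally between real and imaginary parts
w = (sigma / np.sqrt(2)) * (rng.standard_normal((n, M)) + 1j * rng.standard_normal((n, M)))

R_rr = w.real.T @ w.real / n   # should approach (sigma^2 / 2) I
R_ii = w.imag.T @ w.imag / n   # should approach (sigma^2 / 2) I
R_ri = w.real.T @ w.imag / n   # should approach 0

assert np.allclose(R_rr, (sigma**2 / 2) * np.eye(M), atol=0.05)
assert np.allclose(R_ii, (sigma**2 / 2) * np.eye(M), atol=0.05)
assert np.allclose(R_ri, np.zeros((M, M)), atol=0.05)
```

The sample covariances converge at the usual $1/\sqrt{n}$ Monte Carlo rate, which is why a loose tolerance suffices here.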
VI. OPTIMIZING WEIGHTING FACTORS TO MINIMIZE MSES

Based on the MSEs obtained in the previous section, we now seek to optimize the weighting factors $\{\alpha_n\}_{n=1}^N$ to minimize the MSEs and improve the performance of the proposed algorithm. Minimizing (47) with respect to $\{\alpha_n\}_{n=1}^N$ directly is too complex to be tractable. However, the MSEs can be regarded as asymptotic error variances as the noise power $\sigma^2$ approaches zero, so we can take their ratio to the Cramér-Rao Bound (CRB) as an equivalent performance measure, and then use inequalities to bound this ratio so that only the entries related to $\{\alpha_n\}_{n=1}^N$ need to be minimized.

We have derived the CRB of $N$-D frequency estimation from (2) using a method similar to that of [22]. The result is given as follows:
$$\mathrm{var}_{\mathrm{CRB}}(\hat{\omega}_{f,n}) = \frac{\sigma^2}{2}\, b_{(n-1)F+f}, \tag{48}$$
where $b_{(n-1)F+f}$ is the $((n-1)F+f)$-th element of the vector
$$b = \mathrm{Diag}\left[\Re\left(\sum_{t=1}^T C_e^H(t)\, Z^H \left(I_M - A\left(A^H A\right)^{-1} A^H\right) Z\, C_e(t)\right)\right]^{-1},$$
where $C_e(t) := I_N \otimes D(c(t))$, and
$$Z = \big[\,z_{1,1}\ z_{2,1}\ \cdots\ z_{F,1}\ z_{1,2}\ \cdots\ z_{F,N}\,\big] : M \times NF,$$
$$z_{f,n} = (\theta_{f,1} \otimes \cdots \otimes \theta_{f,n-1}) \otimes \vartheta_{f,n} \otimes (\theta_{f,n+1} \otimes \cdots \otimes \theta_{f,N}),$$
$$\theta_{f,n} = \big[1\ e^{j\omega_{f,n}}\ \cdots\ e^{j(M_n-1)\omega_{f,n}}\big]^T : M_n \times 1,$$
$$\vartheta_{f,n} = \big[0\ je^{j\omega_{f,n}}\ \cdots\ j(M_n-1)e^{j(M_n-1)\omega_{f,n}}\big]^T : M_n \times 1.$$

We then define the asymptotic efficiency of the algorithm on the estimation of $\omega_{f,n}$ as
$$\eta_{f,n} := \lim_{\sigma^2 \to 0} \frac{E\big[\Delta\omega_{f,n}^2\big]}{\mathrm{var}_{\mathrm{CRB}}(\hat{\omega}_{f,n})}. \tag{49}$$
We can bound $\eta_{f,n}$ using properties of matrix norms such that
$$\eta_{f,n} \le \frac{1}{\mu_n^2\, b_{(n-1)F+f}}\, \|\tau_{f,n}\|^2 \big(\|J_1\| + \|U_1\|\,\|R_f\|\big)^2 \sum_{t=1}^T \left[\bigg\|\sum_{g=1}^F t_{f,g}\big(\Phi_{g,t} - \tilde{\Phi}_{g,t}\big)\bigg\|^2 + \bigg\|\sum_{g=1}^F t_{f,g}\big(\Phi_{g,t} + \tilde{\Phi}_{g,t}\big)\bigg\|^2\right].$$
On the RHS of the above inequality, only $R_f$ depends on the weighting factors $\{\alpha_n\}_{n=1}^N$. Furthermore, $\|R_f\| \le \|T\|\,\|D_f\|\,\|P_{sp}^\dagger\|\,\|J_1\zeta_f - J_2\|$. Define $\alpha := [\alpha_1\ \cdots\ \alpha_N]^T$ and $\omega_f := [e^{j\omega_{f,1}}\ \cdots\ e^{j\omega_{f,N}}]^T$. Notice that $\zeta_f = \omega_f^T\alpha$ (see (21)). Ignoring all the entries that do not depend on $\alpha$, we find that the upper bound of the efficiency $\eta_{f,n}$ is decided by
$$\|D_f\|\,\|J_1\zeta_f - J_2\| = \sum_{g=1, g\ne f}^F \frac{\big\|\sum_{n=1}^N \alpha_n\big(J_1 e^{j\omega_{f,n}} - \tilde{J}_{2,n}\big)\big\|}{\big|(\omega_f^T - \omega_g^T)\alpha\big|}. \tag{50}$$
Without loss of generality, we assume $\|\alpha\| \le 1$, since $\alpha$ appears in both the numerator and the denominator of the RHS of (50). If $\|\alpha\| \le 1$, then $|\alpha_n| \le 1$ for $n = 1, \cdots, N$. In this case,
$$\|D_f\|\,\|J_1\zeta_f - J_2\| \le N\big(\|J_1\| + \|\tilde{J}_{2,n}\|\big)\gamma_f(\alpha), \tag{51}$$
where $\gamma_f(\alpha) := \sum_{g=1, g\ne f}^F \frac{1}{|(\omega_f^T - \omega_g^T)\alpha|}$. In order to minimize the upper bound of the average efficiency
$$\eta = \frac{1}{NF}\sum_{n=1}^N\sum_{f=1}^F \eta_{f,n}, \tag{52}$$
we propose to minimize the following cost function
$$\gamma(\alpha) = \frac{1}{2}\sum_{f=1}^F \gamma_f(\alpha) = \sum_{f=1}^{F-1}\sum_{g=f+1}^F \frac{1}{\big|(\omega_f^T - \omega_g^T)\alpha\big|}. \tag{53}$$
An optimal $\alpha$ can be obtained by
$$\alpha_{\mathrm{opt}} = \arg\min_{\alpha}\ \gamma(\alpha), \quad \text{subject to } \|\alpha\| \le 1. \tag{54}$$
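To make the cost (53) concrete, the following sketch (our illustration, not the paper's implementation) evaluates $\gamma(\alpha)$ and performs the coarse phase-grid initialization used later in this section, with the moduli fixed at $|\alpha_n| = 1/\sqrt{N}$ so that $\|\alpha\| = 1$; the Newton refinement of (54) is omitted. The example frequencies are the three 3-D components (55)-(57) used in the Fig. 2 example.

```python
import itertools
import numpy as np

def gamma_cost(alpha, omega):
    """Sum-of-ratios cost (53); omega is F x N with rows [e^{j w_f,1}, ..., e^{j w_f,N}]."""
    zeta = omega @ alpha  # eigenvalue positions zeta_f = omega_f^T alpha
    F = len(zeta)
    return sum(1.0 / abs(zeta[f] - zeta[g])
               for f in range(F - 1) for g in range(f + 1, F))

# Example: the three 3-D frequency components of (55)-(57)
w = np.pi * np.array([[0.70, 0.50, 0.20],
                      [0.80, 0.60, 0.20],
                      [0.80, 0.50, 0.40]])
omega = np.exp(1j * w)
N = omega.shape[1]

# Coarse search over phases with fixed moduli |alpha_n| = 1/sqrt(N)
phases = np.linspace(0.0, 2 * np.pi, 12, endpoint=False)
cost, best_phases = min(
    (gamma_cost(np.exp(1j * np.array(p)) / np.sqrt(N), omega), p)
    for p in itertools.product(phases, repeat=N))
alpha0 = np.exp(1j * np.array(best_phases)) / np.sqrt(N)
```

Here `alpha0` would serve only as a starting point for a Newton-type refinement of (54); the coarse grid merely avoids the "bad regions" where some pair of eigenvalues nearly collides.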
In fact the optimization criterion (54) has a simple interpretation in terms of the eigenvalue distribution of $U_1^\dagger U_2$. We define the difference of eigenvalues as $d_{f_1,f_2} := \zeta_{f_1} - \zeta_{f_2} = (\omega_{f_1}^T - \omega_{f_2}^T)\alpha$, for $1 \le f_1, f_2 \le F$. Notice that these differences appear as the denominators in the entries of $\gamma(\alpha)$ (see (53)) and the diagonal elements of $D_f$ (see (41)). If $\alpha$ is approximately orthogonal to some $\omega_{f_1}^T - \omega_{f_2}^T$, $d_{f_1,f_2}$ is close
to zero, then $\gamma(\alpha)$ will be very large, and the performance of the algorithm will degrade dramatically due to high MSE.

We further use an example to show the effect of $\alpha$ on $\eta$. Suppose that three 3-D frequency components are to be estimated from 3 snapshots of $6\times 6\times 6$ noisy data samples as given in (2). The 3-D frequency components are
$$f = 1:\ (\omega_{1,1}, \omega_{1,2}, \omega_{1,3}) = (0.70\pi,\ 0.50\pi,\ 0.20\pi), \tag{55}$$
$$f = 2:\ (\omega_{2,1}, \omega_{2,2}, \omega_{2,3}) = (0.80\pi,\ 0.60\pi,\ 0.20\pi), \tag{56}$$
$$f = 3:\ (\omega_{3,1}, \omega_{3,2}, \omega_{3,3}) = (0.80\pi,\ 0.50\pi,\ 0.40\pi). \tag{57}$$
Fig. 2 depicts the logarithmic plot of $\eta$ as a function of $\alpha_1$ and $\alpha_2$, while $\alpha_3$ is fixed at $\sqrt{1/3}$. For simplicity of the plot, we also set $|\alpha_1| = |\alpha_2| = \sqrt{1/3}$, and hence the x- and y-axes in Fig. 2 are the angles of $\alpha_1$ and $\alpha_2$, respectively. As illustrated in Fig. 2, in most cases $\alpha$ keeps the average efficiency $\eta$ as small as 1, but when $\alpha$ falls into some "bad regions", the average efficiency becomes very large. A randomly selected $\alpha$ is not guaranteed to avoid the "bad regions". Therefore we propose to use (54) to optimize the choice of $\alpha$.

The optimization problem (54) is a so-called sum-of-ratios fractional programming problem, which is a difficult global optimization problem [3]; no efficient algorithm is available to solve it to date. From Fig. 2 we notice that the "bad regions" are often regular and small, so most choices of $\alpha$ are fairly good for obtaining a small average efficiency. We can use a grid search over the hypersphere $\|\alpha\| \le 1$ to find a moderate initial value of $\alpha$, then use a Newton-type algorithm to find an optimal $\{\alpha_n\}_{n=1}^N$. In order to reduce the complexity, we may set $|\alpha_n| = \sqrt{1/N}$, for $n = 1, \cdots, N$, and the search grid does not need to be fine (for example, the step size of the angle in one dimension can be set to $\pi/F$). Alternatively we can use the following method to obtain a moderate initial value of $\{\alpha_n\}_{n=1}^N$. If we define
$$\epsilon(\alpha) := \min_{1 \le f < g \le F} \big|(\omega_f^T - \omega_g^T)\alpha\big|, \tag{58}$$

our proof still holds for the corresponding column-reduced or row-reduced square submatrix. Suppose that $2TL = F$. The determinant of $\tilde{H}$, $\det(\tilde{H})$, is a polynomial of the $(N+T)F$ variables $\{e^{j\omega_{f,n}}\}_{n=1}^N$, $\{c_f(t)\}_{t=1}^T$, $f = 1, \cdots, F$, and thus is analytic in $\mathbb{C}^{(T+N)F\times 1}$. According to Lemma 1, if we can find one point in $\mathbb{C}^{(T+N)F\times 1}$ such that $\det(\tilde{H}) \ne 0$, then $\tilde{H}$ has full column rank almost surely. Define
$$L'_n := \begin{cases} \prod_{p=n+1}^N L_p, & 0 \le n \le N-1, \\ 1, & n = N. \end{cases} \tag{61}$$
If we choose the $(T+N)F$ variables such that
$$c_f(t) = c\, e^{j\left(t \frac{\nu_f}{2T} + \nu_{0,f}\right)}, \quad t = 1, \ldots, T,$$
$$e^{j\omega_{f,n}} = e^{j\nu_f L'_n}, \quad n = 1, \ldots, N,$$
where $f = 1, \ldots, F$, $c > 0$, $\nu_f := 2\pi\frac{f-1}{F}$, and
$$\nu_{0,f} := -\frac{\nu_f}{2}\left[\sum_{n=1}^N (M_n - 1)L'_n + 1 + \frac{1}{2T}\right].$$
Then, it can be verified that in this case both $B$ in (9) and $\tilde{H}$ in (11) are Vandermonde matrices with generators equi-spaced on the unit circle, and thus $\det(\tilde{H}) \ne 0$. Therefore $\tilde{H}$ has full rank almost surely.

APPENDIX B
PERTURBATION OF THE SIGNAL SUBSPACE

Given (4), (15), (34), and (35), we derive the first-order perturbation of the signal subspace, $\Delta U_s$, when $X$ is perturbed by $W$, or equivalently, $\tilde{X}$ is perturbed by $\Delta\tilde{X}$. Suppose that the nonzero singular values of $\tilde{X}$ in (34) are $\lambda_1 > \lambda_2 > \cdots > \lambda_F$. For the $f$-th singular vector $u_f$ in $U_s$, we have
$$\tilde{X}\tilde{X}^H u_f = \lambda_f^2\, u_f. \tag{62}$$
Differentiating (62), we obtain
$$\Delta\tilde{X}\tilde{X}^H u_f + \tilde{X}\Delta\tilde{X}^H u_f + \tilde{X}\tilde{X}^H \Delta u_f = 2\lambda_f\, \Delta\lambda_f\, u_f + \lambda_f^2\, \Delta u_f. \tag{63}$$
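Projecting (63) onto $u_f^H$ and using $u_f^H \tilde{X}\tilde{X}^H = \lambda_f^2 u_f^H$ (so that the $\Delta u_f$ terms cancel) gives the first-order singular value relation $2\lambda_f\Delta\lambda_f = u_f^H(\Delta\tilde{X}\tilde{X}^H + \tilde{X}\Delta\tilde{X}^H)u_f$. This is easy to verify numerically for a generic matrix (our illustration, not part of the appendix; the dimensions, seed, and perturbation size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
K, L, eps = 6, 9, 1e-6

# Generic complex matrix and a small random perturbation standing in for dX
X = rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L))
U, s, Vh = np.linalg.svd(X)
E = eps * (rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L)))
s_hat = np.linalg.svd(X + E, compute_uv=False)

for f in range(K):
    # u_f^H (dX X^H + X dX^H) u_f predicts the change in lambda_f^2
    predicted = (U[:, f].conj() @ (E @ X.conj().T + X @ E.conj().T) @ U[:, f]).real
    actual = s_hat[f] ** 2 - s[f] ** 2
    assert abs(predicted - actual) < 1e-2 * abs(actual) + 1e-10
```

Because the singular values of a generic Gaussian matrix are well separated, matching perturbed and unperturbed singular values by sorted order is safe here.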
According to [24, Chapter 2], if $\Delta u_f$ is approximately orthogonal to $u_f$, we can express the perturbation $\Delta u_f$ as a linear combination of $u_g$, where $g = 1, \ldots, K$, $g \ne f$. Therefore,
$$\Delta u_f = \sum_{g=1, g\ne f}^K q_{gf}\, u_g, \tag{64}$$
where $\{q_{gf}\}$ are the coefficients to be determined. Substituting (64) into (63), multiplying both sides from the left by $u_g^H$, $g \ne f$, and using $u_g^H \tilde{X}\tilde{X}^H u_{g'} = \lambda_g^2 \delta_{g,g'}$ and $u_g^H u_{g'} = \delta_{g,g'}$, where $\delta_{g,g'}$ is the Kronecker delta ($\delta_{g,g'} = 1$ when $g = g'$; $\delta_{g,g'} = 0$ when $g \ne g'$), we have
$$q_{gf} = \frac{u_g^H\big(\Delta\tilde{X}\tilde{X}^H + \tilde{X}\Delta\tilde{X}^H\big)u_f}{\lambda_f^2 - \lambda_g^2}, \quad g = 1, \ldots, K,\ g \ne f. \tag{65}$$
Substituting (65) into (64) and using $\tilde{X}^H u_f = v_f \lambda_f$ and $U_n^H \tilde{X} = 0$, we obtain
$$\Delta u_f = U_n U_n^H \Delta\tilde{X}\, v_f \lambda_f^{-1} + U_s S_f U_s^H \Delta\tilde{X}\, v_f \lambda_f + U_s S_f \Sigma_s V_s^H \Delta\tilde{X}^H u_f, \tag{66}$$
where $v_f$ is the $f$-th column of $V_s$, and $S_f$ is a diagonal matrix of size $F \times F$ with the $(f,f)$-th element equal to 0 and the $(g,g)$-th element equal to $\frac{1}{\lambda_f^2 - \lambda_g^2}$, for $g = 1, \ldots, F$, $g \ne f$. Notice that the result in (66) is different from the first-order perturbation results of the signal subspace given in [12], where only the contribution of the noise subspace (i.e., the first term on the RHS of (66)) was taken into account. However, (66) is consistent with the results of the eigenvector perturbation analysis of [4], [10], [18].

Next we express $\Delta u_f$ as a function of $W$ by substituting (33) into (66). First, we split the vector $v_f$ into $2T$ equal-sized sub-vectors such that
$$v_f^T := \big[\,v_{f,1}^T\ v_{f,2}^T\ \cdots\ v_{f,T}^T\ \tilde{v}_{f,T}^T\ \cdots\ \tilde{v}_{f,1}^T\,\big].$$
Let $v_{f,t}^T := [v_{f,t}(1)\ \cdots\ v_{f,t}(L)]$ and $\tilde{v}_{f,t}^T := [\tilde{v}_{f,t}(1)\ \cdots\ \tilde{v}_{f,t}(L)]$, for $t = 1, \cdots, T$. Define
$$Q_{f,t} := J_{1,\cdots,1}\, v_{f,t}(1) + J_{1,\cdots,2}\, v_{f,t}(2) + \cdots + J_{L_1,L_2,\cdots,L_N}\, v_{f,t}(L), \tag{67}$$
$$\tilde{Q}_{f,t} := J_{1,\cdots,1}\, \tilde{v}_{f,t}(1) + J_{1,\cdots,2}\, \tilde{v}_{f,t}(2) + \cdots + J_{L_1,L_2,\cdots,L_N}\, \tilde{v}_{f,t}(L). \tag{68}$$
It can be verified that
$$S[w(t)]\, v_{f,t} = Q_{f,t}\, w(t), \qquad S[\Pi_M w^*(t)]\, \tilde{v}_{f,t} = \tilde{Q}_{f,t}\, \Pi_M w^*(t),$$
and thus the first two terms on the RHS of (66) can be written as
$$U_n U_n^H \Delta\tilde{X}\, v_f \lambda_f^{-1} + U_s S_f U_s^H \Delta\tilde{X}\, v_f \lambda_f = \sum_{t=1}^T E_f Q_{f,t}\, w(t) + \sum_{t=1}^T E_f \tilde{Q}_{f,t}\, \Pi_M w^*(t), \tag{69}$$
where $E_f := U_n U_n^H \lambda_f^{-1} + U_s S_f U_s^H \lambda_f$. For the third term on the RHS of (66), we can divide the matrix $U_s S_f \Sigma_s V_s^H$ into $2T$ blocks such that
$$U_s S_f \Sigma_s V_s^H = \big[\,K_{f,1}\ K_{f,2}\ \cdots\ K_{f,T}\ \tilde{K}_{f,T}\ \cdots\ \tilde{K}_{f,1}\,\big].$$
Then, it can be verified that
$$U_s S_f \Sigma_s V_s^H \Delta\tilde{X}^H u_f = \sum_{t=1}^T K_{f,t}\, U_f\, w^*(t) + \sum_{t=1}^T \tilde{K}_{f,t}\, U_f\, \Pi_M w(t), \tag{70}$$
where
$$U_f := \begin{bmatrix} u_f^T J_{1,1,\cdots,1} \\ u_f^T J_{1,1,\cdots,2} \\ \vdots \\ u_f^T J_{1,1,\cdots,L_N} \\ u_f^T J_{1,1,\cdots,2,1} \\ \vdots \\ u_f^T J_{L_1,L_2,\cdots,L_N} \end{bmatrix}.$$
Summing (69) and (70), we can write (66) as
$$\Delta u_f = \sum_{t=1}^T \Phi_{f,t}\, w(t) + \sum_{t=1}^T \tilde{\Phi}_{f,t}\, w^*(t), \tag{71}$$
where
$$\Phi_{f,t} = E_f Q_{f,t} + \tilde{K}_{f,t}\, U_f\, \Pi_M, \qquad \tilde{\Phi}_{f,t} = E_f \tilde{Q}_{f,t}\, \Pi_M + K_{f,t}\, U_f.$$

ACKNOWLEDGEMENT
The authors would like to thank an anonymous reviewer whose comments have helped to improve the perturbation analysis of the signal subspace.

REFERENCES

[1] J. Brewer, "Kronecker products and matrix calculus in system theory," IEEE Trans. Circuits and Systems, vol. 25, no. 9, pp. 772-781, Sept. 1978.
[2] S. Browne, J. Dongarra, N. Garner, and P. Mucci, "A portable programming interface for performance evaluation on modern processors," The International Journal of High Performance Computing Applications, vol. 14, no. 3, pp. 189-204, Fall 2000.
[3] R. W. Freund and F. Jarre, "Solving the sum-of-ratios problem by an interior point method," Journal of Global Optimization, vol. 19, no. 1, pp. 83-102, Jan. 2001.
[4] B. Friedlander and A. Weiss, "On the second-order statistics of the eigenvectors of sample covariance matrices," IEEE Trans. Signal Processing, vol. 46, no. 11, pp. 3136-3139, Nov. 1998.
[5] M. Haardt and J. A. Nossek, "Simultaneous Schur decomposition of several nonsymmetric matrices to achieve automatic pairing in multidimensional harmonic retrieval problems," IEEE Trans. Signal Processing, vol. 46, no. 1, pp. 161-169, Jan. 1998.
[6] Y. Hua and T. K. Sarkar, "Matrix pencil method for estimating parameters of exponentially damped/undamped sinusoids in noise," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 36, no. 5, pp. 814-824, May 1990.
[7] Y. Hua, "Estimating two-dimensional frequencies by matrix enhancement and matrix pencil," IEEE Trans. Signal Processing, vol. 40, no. 9, pp. 2267-2280, Sept. 1992.
[8] Y. Hua and K. A. Meraim, "Techniques of eigenvalues estimation and association," Digital Signal Processing, Academic Press, San Diego, CA, vol. 7, no. 4, pp. 253-259, Oct. 1997.
[9] T. Jiang, N. D. Sidiropoulos, and J. M. F. ten Berge, "Almost-sure identifiability of multidimensional harmonic retrieval," IEEE Trans. Signal Processing, vol. 49, no. 9, pp. 1849-1859, Sept. 2001.
[10] A. N. Lemma, A.-J. van der Veen, and E. F. Deprettere, "Analysis of joint angle-frequency estimation using ESPRIT," IEEE Trans. Signal Processing, vol. 51, no. 5, pp. 1264-1283, May 2003.
[11] A. Leshem, N. Petrochilos, and A.-J. van der Veen, "Finite sample identifiability of multiple constant modulus sources," IEEE Trans. Information Theory, vol. 49, no. 9, pp. 2314-2319, Sept. 2003.
[12] F. Li, H. Liu, and R. J. Vaccaro, "Performance analysis for DOA estimation algorithms: unification, simplification, and observations," IEEE Trans. Aerospace and Electronic Systems, vol. 29, no. 4, pp. 1170-1184, Oct. 1993.
[13] J. Liu and X. Liu, "An eigenvector-based approach for multidimensional frequency estimation with improved identifiability," IEEE Trans. Signal Processing, vol. 54, no. 12, pp. 4543-4556, Dec. 2006.
[14] X. Liu, N. D. Sidiropoulos, and T. Jiang, "Multidimensional harmonic retrieval with applications in MIMO wireless channel sounding," in A. B. Gershman and N. D. Sidiropoulos, editors, Space-Time Processing for MIMO Communications, Wiley, pp. 41-75, 2005.
[15] X. Liu and N. D. Sidiropoulos, "Almost sure identifiability of constant modulus multidimensional harmonic retrieval," IEEE Trans. Signal Processing, vol. 50, no. 9, pp. 2366-2368, Sept. 2002.
[16] K. V. Mardia and P. Jupp, Directional Statistics, 2nd ed., Wiley, 2000.
[17] M. Pesavento, C. F. Mecklenbräuker, and J. F. Böhme, "Multidimensional rank reduction estimator for parametric MIMO channel models," EURASIP Journal on Applied Signal Processing, vol. 2004, pp. 1354-1363, Sept. 2004.
[18] B. D. Rao and K. V. S. Hari, "Performance analysis of ESPRIT and TAM in determining the direction of arrival of plane waves in noise," IEEE Trans. Signal Processing, vol. 37, no. 12, pp. 1990-1995, Dec. 1989.
[19] S. Rouquette and M. Najim, "Estimation of frequencies and damping factors by two-dimensional ESPRIT type methods," IEEE Trans. Signal Processing, vol. 49, no. 1, pp. 237-245, Jan. 2001.
[20] R. Roy and T. Kailath, "ESPRIT-estimation of signal parameters via rotational invariance techniques," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 37, no. 7, pp. 984-995, Jul. 1989.
[21] M. Steinbauer, A. F. Molisch, and E. Bonek, "The double-directional radio channel," IEEE Antennas and Propagation Magazine, vol. 43, no. 4, pp. 51-63, Aug. 2001.
[22] P. Stoica and A. Nehorai, "MUSIC, maximum likelihood, and Cramér-Rao bound," IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 37, no. 5, pp. 720-741, May 1989.
[23] A.-J. van der Veen, M. C. van der Veen, and A. Paulraj, "Joint angle and delay estimation using shift-invariance techniques," IEEE Trans. Signal Processing, vol. 46, no. 2, pp. 405-418, Feb. 1998.
[24] J. H. Wilkinson, The Algebraic Eigenvalue Problem, New York: Oxford University Press, 1965.
[25] M. D. Zoltowski and C. P. Mathews, "Real-time frequency and 2-D angle estimation with sub-Nyquist spatio-temporal sampling," IEEE Trans. Signal Processing, vol. 42, no. 10, pp. 2781-2794, Oct. 1994.
[26] M. D. Zoltowski, M. Haardt, and C. P. Mathews, "Closed-form 2-D angle estimation with rectangular arrays in element space or beamspace via unitary ESPRIT," IEEE Trans. Signal Processing, vol. 44, no. 2, pp. 316-328, Feb. 1996.
Jun Liu (S'06) received the B.S. and M.S. degrees in Electronics and Information Engineering from Huazhong University of Science and Technology (HUST), Wuhan, China, in 1998 and 2001, respectively. From 2001 to 2004, he worked as a software engineer at Lucent Technologies, Beijing, China. He is currently pursuing the Ph.D. degree with the Department of Electrical and Computer Engineering at the University of Louisville. His current research interests are in the areas of signal processing and wireless communications, with applications in multidimensional frequency estimation.
Xiangqian Liu (S'99-M'02) received the B.S. degree in Electrical Engineering from Beijing Institute of Technology, Beijing, China, in 1997, the M.S. degree in Electrical Engineering from the University of Virginia, Charlottesville, in 2000, and the Ph.D. degree in Electrical Engineering from the University of Minnesota, Minneapolis, in 2002. Since 2002, he has been an assistant professor of Electrical and Computer Engineering at the University of Louisville. His current research interests are in the areas of wireless communications and signal processing, including multiuser detection, channel modeling and estimation, multi-antenna systems, spectral estimation, sensor array processing, as well as signal processing techniques for wireless sensor networks.
[Fig. 1(a) plot omitted; maximum number of 2-D frequencies versus $M_1$ ($M_1 = M_2$), showing the ID bound of Eqn. (30) for $T = 1, 2, 3$.]
Xiaoli Ma (M'03) received the B.S. degree in Automatic Control from Tsinghua University, Beijing, China, in 1998, the M.S. degree in Electrical Engineering from the University of Virginia in 2000, and the Ph.D. degree in Electrical Engineering from the University of Minnesota in 2003. From 2003 to 2005, she was an assistant professor of Electrical and Computer Engineering at Auburn University. Since 2006, she has been with the School of Electrical and Computer Engineering at the Georgia Institute of Technology. Her research interests include transceiver designs and diversity techniques for wireless time- and frequency-selective channels, channel modeling, estimation and equalization, carrier frequency synchronization for OFDM systems, and routing and cooperative designs for wireless networks.

Dr. Ma serves as an Associate Editor for IEEE SIGNAL PROCESSING LETTERS, and is a member of the Signal Processing Theory and Methods Technical Committee of the IEEE Signal Processing Society.
[Fig. 1(b) plot omitted; maximum number of 2-D frequencies versus the number of snapshots $T$, showing the ID bounds of Eqns. (30), (31), and (32).]
Fig. 1. Identifiability (ID) bound of 2-D frequency estimation: (a) ID bound versus M1 when M1 = M2 ; (b) ID bounds versus the number of snapshots where M1 = 10 and M2 = 6.
TABLE I
AN IMPROVED EIGENVECTOR-BASED ALGORITHM USING OPTIMAL WEIGHTING FACTORS

1) Given (3), follow (12)-(17) to obtain $U_{sq}$.
2) Randomly select $\alpha$ subject to $|\alpha_n| = \frac{1}{\sqrt{N}}$, $n = 1, \ldots, N$; compute $\{\hat{\omega}_{f,n}\}_{n=1}^N$, $f = 1, \ldots, F$, using (19)-(26).
3) Based on $\{\hat{\omega}_{f,n}\}_{n=1}^N$, $f = 1, \ldots, F$, obtain an updated $\alpha_{\mathrm{opt}}$ by first solving (59) using SQP to get initial values, then solving (54) using a Newton method.
4) Compute updated $\{\hat{\omega}_{f,n}\}_{n=1}^N$, $1 \le f \le F$, with $\alpha_{\mathrm{opt}}$ using (19)-(26).
5) Iterate Steps 3 and 4 until the frequency estimates converge (typically one execution of Steps 3-4 is sufficient).
[Fig. 2 plot omitted; surface plot of $\log_{10}\eta$ versus $\angle\alpha_1$ and $\angle\alpha_2$.]

Fig. 2. The average efficiency $\eta$ as a function of $\alpha_1$ and $\alpha_2$, with $|\alpha_1| = |\alpha_2| = \sqrt{1/3}$ and $\alpha_3 = \sqrt{1/3}$.
[Figs. 3 and 4 plots omitted. Fig. 3 compares Random α, Minmax, Minmax+Newton, and Grid+Newton against the CRB on STD. Fig. 4 compares MEMP, the proposed algorithm, Unitary ESPRIT, and 2-D ESPRIT [18] against the CRB on STD.]

Fig. 3. (a) RMSE of different optimization methods versus SNR; (b) RMSE of different optimization methods versus the number of iterations of Steps 3-4.

Fig. 4. (a) Comparison of different algorithms for 2-D frequency estimation from a single snapshot; (b) Comparison of optimized α and randomly chosen α.
[Figs. 5 and 6 plots omitted. Panel (a) of each compares the proposed algorithm and Unitary ESPRIT (and RARE in Fig. 5) against the CRB on STD; panel (b) of each compares randomly chosen α, the proposed algorithm, and the theoretic RMSE against the CRB on STD.]

Fig. 5. (a) Comparison of different algorithms for 2-D frequency estimation from multiple snapshots; (b) Comparison of optimized α and randomly chosen α.

Fig. 6. (a) Comparison of different algorithms for 3-D frequency estimation from multiple snapshots; (b) Comparison of optimized α and randomly chosen α.
[Fig. 7 plots omitted; both panels show the number of floating point operations of the proposed algorithm and Unitary ESPRIT versus $M_1$ ($M_1 = M_2 = M_3$).]

Fig. 7. The number of floating point operations versus $M_1$ when: (a) $F = 3$; (b) $F = 30$.