PARAMETRIC FREQUENCY ESTIMATION: ESPRIT AND MUSIC

JULIUS KUSUMA

Abstract. In many scenarios we are interested in examining the spectral content of a measured signal. In general, methods for estimating the spectral content are either parametric or non-parametric. Parametric methods take advantage of known parameters of the signal, such as the number of tones it contains. Non-parametric methods make no such assumptions a priori. In this document we examine two parametric algorithms, MUSIC and ESPRIT, both of which assume a known number of tones in the measured signal.

1. Introduction

We assume a signal model as follows:
\[ x[n] = \sum_{k=1}^{K} A_k \exp(j\omega_k n) + z[n], \tag{1.1} \]
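where $A_k \in \mathbb{C}$ is a complex number representing the magnitude and phase of the $k$-th frequency component, and $z[n]$ is additive noise. We will examine the subspace structure of signals composed of several frequency components, starting with the autocorrelation matrix.

2. Subspace decomposition of correlation matrix

Recall that the autocorrelation of a signal $x[n]$ is defined as:
\[ r_x[k] = E\big[x[n]\,x^*[n-k]\big], \tag{2.1} \]
and that, for the data vector $\mathbf{x} = [x[n], x[n+1], \ldots, x[n+M-1]]^T$, the $M \times M$ autocorrelation matrix of $x[n]$ is defined as:
\[ \mathbf{R}_x = E[\mathbf{x}\mathbf{x}^H] = \begin{bmatrix} r_x[0] & r_x^*[1] & \cdots & r_x^*[M-1] \\ r_x[1] & r_x[0] & \cdots & r_x^*[M-2] \\ \vdots & \vdots & \ddots & \vdots \\ r_x[M-1] & r_x[M-2] & \cdots & r_x[0] \end{bmatrix}, \tag{2.2} \]
where we have used $r_x[-k] = r_x^*[k]$. Now we can eigen-decompose the correlation matrix $\mathbf{R}_x$:
\[ \mathbf{R}_x = \mathbf{U} \mathbf{\Lambda}_x \mathbf{U}^H, \tag{2.3} \]
where the columns of $\mathbf{U}$ are the orthonormal eigenvectors of $\mathbf{R}_x$ and $\mathbf{\Lambda}_x$ is the diagonal matrix of its eigenvalues.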
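In practice the expectation in Eqn. 2.2 has to be replaced by an estimate. The following is a minimal NumPy sketch, assuming the expectation is approximated by averaging outer products of sliding length-$M$ windows of the data; this estimator and the helper names `autocorr_matrix` and `sorted_eig` are illustrative assumptions, not part of the original text.

```python
import numpy as np

def autocorr_matrix(x, M):
    """Sample estimate of the M x M matrix R_x = E[x x^H] (Eqn. 2.2).

    Averages x_n x_n^H over the windows x_n = [x[n], ..., x[n+M-1]]^T,
    so that entry (i, j) estimates r_x[i - j].
    """
    x = np.asarray(x, dtype=complex)
    # Each row of X is one window x_n^T.
    X = np.array([x[n:n + M] for n in range(len(x) - M + 1)])
    return X.T @ X.conj() / X.shape[0]

def sorted_eig(R):
    """Eigen-decomposition R = U diag(lam) U^H, eigenvalues in descending order."""
    lam, U = np.linalg.eigh(R)          # eigh: Hermitian input, ascending output
    order = np.argsort(lam)[::-1]
    return lam[order], U[:, order]

# Noise-free single tone: every window is a multiple of e_1, so R = e_1 e_1^H.
x = np.exp(1j * 0.7 * np.arange(64))
R = autocorr_matrix(x, M=4)
lam, U = sorted_eig(R)
```

For this noise-free tone the estimate is Hermitian with a single nonzero eigenvalue; with noisy data the remaining eigenvalues would instead cluster around the noise variance.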
But what does this give us? Let's examine a simple case first: a single frequency component in noise.

Date: May 13, 2002.
1991 Mathematics Subject Classification. Signal Processing.
Key words and phrases. ESPRIT, MUSIC, Frequency, Estimation.
Thanks to Kannan Ramchandran for comments on improving the presentation of this material.

2.1. Single frequency component in noise. In this case we have:
\[ x[n] = A_1 \exp(j\omega_1 n) + z[n], \]
where we assume as usual that $z[n]$ is white noise with variance $\sigma_w^2$. It can be shown that the autocorrelation of Eqn. 2.1 will be:
\[ r_x[k] = \underbrace{|A_1|^2 \exp(j\omega_1 k)}_{\text{signal component}} + \underbrace{\sigma_w^2\,\delta[k]}_{\text{noise component}}, \tag{2.4} \]
giving a decomposition of the autocorrelation matrix of Eqn. 2.2:
\[ \mathbf{R}_x = \mathbf{R}_s + \mathbf{R}_n, \tag{2.5} \]
where $\mathbf{R}_s$ and $\mathbf{R}_n$ are the signal and noise contributions respectively. Writing out Eqn. 2.4 in matrix form, we see that:
\[ \mathbf{R}_s = |A_1|^2 \begin{bmatrix} 1 & e^{-j\omega_1} & e^{-j2\omega_1} & \cdots & e^{-j\omega_1(M-1)} \\ e^{j\omega_1} & 1 & e^{-j\omega_1} & \cdots & e^{-j\omega_1(M-2)} \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ e^{j\omega_1(M-1)} & \cdots & e^{j2\omega_1} & e^{j\omega_1} & 1 \end{bmatrix} \]
and:
\[ \mathbf{R}_n = \sigma_w^2 \mathbf{I}. \]
Now define $\mathbf{e}_1 = [1, e^{j\omega_1}, e^{j2\omega_1}, \ldots, e^{j\omega_1(M-1)}]^T$. Then:
\[ \mathbf{R}_s = |A_1|^2\, \mathbf{e}_1 \mathbf{e}_1^H. \]
By inspection, $\mathrm{rank}(\mathbf{R}_s) = 1$, so there is only one nonzero eigenvalue $\lambda_1 \neq 0$. Moreover,
\[ \mathbf{R}_s \mathbf{e}_1 = |A_1|^2\, \mathbf{e}_1 (\mathbf{e}_1^H \mathbf{e}_1) = M |A_1|^2\, \mathbf{e}_1. \tag{2.6} \]
In other words, $\mathbf{e}_1$ is an eigenvector of $\mathbf{R}_s$ with eigenvalue $\lambda_1 = M |A_1|^2$.

2.2. Multiple frequency components in noise. Here we extend the treatment of Sec. 2.1 to include multiple frequency components. First recall Eqn. 1.1:
\[ x[n] = \sum_{k=1}^{K} A_k \exp(j\omega_k n) + z[n]. \]
Using the decomposition idea of Eqn. 2.5, it can be shown (for example, see [2]) that, similarly to the single-tone decomposition of Eqn. 2.6:
\[ \mathbf{R}_x = \mathbf{R}_s + \mathbf{R}_n = \underbrace{\sum_{k=1}^{K} |A_k|^2\, \mathbf{e}_k \mathbf{e}_k^H}_{\text{signal component}} + \underbrace{\sigma_w^2 \mathbf{I}}_{\text{noise component}}, \tag{2.7} \]
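where $\mathbf{e}_k = [1, e^{j\omega_k}, e^{j2\omega_k}, \ldots, e^{j\omega_k(M-1)}]^T$. We can rewrite Eqn. 2.7 compactly as:
\[ \mathbf{R}_x = \mathbf{E}\mathbf{\Lambda}\mathbf{E}^H + \sigma_w^2 \mathbf{I}, \qquad \mathbf{E} = \underbrace{[\mathbf{e}_1 \; \ldots \; \mathbf{e}_K]}_{M \times K \text{ matrix}}, \qquad \mathbf{\Lambda} = \underbrace{\mathrm{diag}\big(|A_1|^2, |A_2|^2, \ldots, |A_K|^2\big)}_{K \times K \text{ matrix}}. \tag{2.8} \]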
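Note how the autocorrelation matrix nicely decomposes into a signal subspace and a noise subspace. In terms of the eigenvectors $\mathbf{u}_i$ of Eqn. 2.3, we can write Eqn. 2.8 in a more algebraic form:
\[ \mathbf{R}_x = \sum_{i=1}^{K} (\lambda_i + \sigma_w^2)\, \mathbf{u}_i \mathbf{u}_i^H + \sum_{i=K+1}^{M} \sigma_w^2\, \mathbf{u}_i \mathbf{u}_i^H, \]
where in this expression $\lambda_1, \ldots, \lambda_K$ denote the nonzero eigenvalues of the signal part $\mathbf{E}\mathbf{\Lambda}\mathbf{E}^H$. We can similarly collect the eigenvectors as:
\[ \mathbf{U}_s = [\mathbf{u}_1 \; \ldots \; \mathbf{u}_K] \quad \text{(signal eigenvectors, } M \times K\text{)}, \qquad \mathbf{U}_n = [\mathbf{u}_{K+1} \; \ldots \; \mathbf{u}_M] \quad \text{(noise eigenvectors, } M \times (M-K)\text{)}. \]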
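The eigen-structure claimed above can be checked numerically. This is a small sketch, assuming the exact model autocorrelation of Eqn. 2.8 is available; the frequencies, amplitudes, noise variance and the helper names `steering`, `model_autocorr` and `split_subspaces` are arbitrary illustrative choices.

```python
import numpy as np

def steering(omega, M):
    """e(omega) = [1, e^{j omega}, ..., e^{j omega (M-1)}]^T."""
    return np.exp(1j * omega * np.arange(M))

def model_autocorr(omegas, amps, sigma2, M):
    """R_x = E Lambda E^H + sigma_w^2 I (Eqn. 2.8), built from known parameters."""
    E = np.column_stack([steering(w, M) for w in omegas])
    Lam = np.diag(np.abs(np.asarray(amps)) ** 2)
    return E @ Lam @ E.conj().T + sigma2 * np.eye(M), E

def split_subspaces(R, K):
    """Return eigenvalues (descending), U_s (M x K) and U_n (M x (M-K))."""
    lam, U = np.linalg.eigh(R)          # ascending order
    lam, U = lam[::-1], U[:, ::-1]      # flip to descending order
    return lam, U[:, :K], U[:, K:]

K, M, sigma2 = 2, 6, 0.1
R, E = model_autocorr([0.5, 1.2], [1.0, 0.5], sigma2, M)
lam, Us, Un = split_subspaces(R, K)
```

In this example the $M - K$ smallest eigenvalues all equal $\sigma_w^2$, and every noise eigenvector is orthogonal to every $\mathbf{e}_k$, which is the property that Pisarenko and MUSIC exploit.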

Remark 2.1. Note that what we have done so far is identify the subspace to which the signal belongs. Recall from linear algebra that infinitely many different bases can span the same subspace (for example, think of a rotation within the subspace, which we will take advantage of later in ESPRIT). So we still need a little more work before we can get the tones out of the autocorrelation matrix.

2.3. Subspace projection: a constructive idea. Think of how we can take advantage of this decomposition idea by using subspace projection. For example, define the projection matrices onto the signal and noise subspaces:
\[ \mathbf{P}_s = \mathbf{U}_s \mathbf{U}_s^H, \qquad \mathbf{P}_n = \mathbf{U}_n \mathbf{U}_n^H. \tag{2.9} \]
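As a quick illustration of Eqn. 2.9, the sketch below forms both projectors for a two-tone model and checks their basic properties. The parameter values are arbitrary assumptions chosen for the example.

```python
import numpy as np

M, K = 6, 2
omegas = np.array([0.5, 1.2])
# Columns of E are the vectors e_k of Eqn. 2.8.
E = np.exp(1j * np.outer(np.arange(M), omegas))
R = E @ np.diag([1.0, 0.25]) @ E.conj().T + 0.1 * np.eye(M)

lam, U = np.linalg.eigh(R)              # eigenvalues in ascending order
Un, Us = U[:, :M - K], U[:, M - K:]     # noise / signal eigenvectors

Ps = Us @ Us.conj().T                   # projector onto the signal subspace
Pn = Un @ Un.conj().T                   # projector onto the noise subspace
```

The two projectors are idempotent and sum to the identity; $\mathbf{P}_n$ annihilates each $\mathbf{e}_k$, while $\mathbf{P}_s$ leaves it unchanged.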
But first let's explore a few techniques that have been well developed in the literature.

3. Pisarenko harmonic decomposition

This was developed in 1973 by Pisarenko [6] and published in a geophysics journal in the UK. The main idea uses Caratheodory's theorem (which bounds how large a set we need in order to capture the dynamics of the parameters of interest), which is of fundamental importance in estimation theory [1]. The algorithm is intuitive, but it is very sensitive to noise.

First set $M = K + 1$, such that $\dim(\mathbf{U}_s) = K$ and $\dim(\mathbf{U}_n) = 1$. Suppose we have the eigenvalue decomposition of the autocorrelation matrix as in Eqn. 2.3. There is only one noise eigenvalue/eigenvector pair, denoted by $\lambda_{\min} = \sigma_w^2$ and $\mathbf{u}_{\min}$. This eigenvector is orthogonal to the signal subspace, or in other words, to every basis vector of the signal subspace:
\[ \mathbf{u}_{\min} \perp \mathbf{u}_{\text{signal}} \iff \mathbf{u}_{\min} \text{ annihilates the signal components.} \]
Recall the shorthand that we have used, $\mathbf{e}_k = [1, e^{j\omega_k}, e^{j2\omega_k}, \ldots, e^{j\omega_k(M-1)}]^T$. Then the above statement is equivalent to:
\[ \mathbf{e}_i^H \mathbf{u}_{\min} = \sum_{k=0}^{K} u_{\min}[k]\, e^{-j\omega_i k} = 0, \qquad i = 1, \ldots, K, \]
or,
\[ U_{\min}(e^{j\omega}) = \sum_{k=0}^{K} u_{\min}[k]\, e^{-j\omega k} \]
has zeros at the frequencies $\omega_k$, for $k = 1, \ldots, K$. In other words:
\[ U_{\min}(z) = \sum_{k=0}^{K} u_{\min}[k]\, z^{-k} = u_{\min}[0] \prod_{k=1}^{K} \big(1 - e^{j\omega_k} z^{-1}\big), \tag{3.1} \]
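which is usually called the annihilating filter.

Proposition 3.1. The annihilating filter of Eqn. 3.1 has zeros on the unit circle at angles corresponding to the frequencies of the signal. We can find the frequencies $\omega_k$, for $k = 1, \ldots, K$, by finding the zeros of $U_{\min}(e^{j\omega}) = \sum_{k=0}^{K} u_{\min}[k]\, e^{-j\omega k}$.

Suppose that the eigenvectors $\mathbf{u}_i$ of $\mathbf{R}_x$ are unit norm, with eigenvalues $\lambda_i$. Then:
\begin{align*}
\mathbf{R}_x \mathbf{u}_i &= \lambda_i \mathbf{u}_i \\
\mathbf{u}_i^H \mathbf{R}_x \mathbf{u}_i &= \lambda_i \mathbf{u}_i^H \mathbf{u}_i = \lambda_i \\
\mathbf{u}_i^H \Big[ \sum_{k=1}^{K} |A_k|^2\, \mathbf{e}_k \mathbf{e}_k^H + \sigma_w^2 \mathbf{I} \Big] \mathbf{u}_i &= \lambda_i \\
\sum_{k=1}^{K} |A_k|^2\, |\mathbf{e}_k^H \mathbf{u}_i|^2 &= \lambda_i - \sigma_w^2.
\end{align*}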
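
Proposition 3.2. We can estimate the magnitudes $|A_k|$ from the above equation: writing it for the $K$ signal eigenvectors, $i = 1, \ldots, K$, gives $K$ linear equations in the $K$ unknowns $|A_k|^2$.

Example 3.3. Let $x[n] = A \sin(\omega_1 n + \varphi) + w[n]$, where the noise is white. A real sinusoid is the sum of two complex exponentials at $\pm\omega_1$, so here $K = 2$ and $M = 3$. First estimate the autocorrelation sequence $\hat{r}_x[k]$ from the available data for $k = 0, 1, 2$. Now form the autocorrelation matrix and eigen-decompose:
\[ \mathbf{R}_x = \begin{bmatrix} \hat{r}_x[0] & \hat{r}_x[1] & \hat{r}_x[2] \\ \hat{r}_x[1] & \hat{r}_x[0] & \hat{r}_x[1] \\ \hat{r}_x[2] & \hat{r}_x[1] & \hat{r}_x[0] \end{bmatrix}, \qquad \mathbf{R}_x = \mathbf{U}\mathbf{\Lambda}\mathbf{U}^H, \qquad \mathbf{U} = [\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3]. \tag{3.2} \]
Suppose $\lambda_1 \geq \lambda_2 > \lambda_3$. Form the annihilating filter from the eigenvector of the minimum eigenvalue, $\mathbf{u}_{\min} = \mathbf{u}_3$:
\[ U_{\min}(z) = \sum_{k=0}^{2} u_{\min}[k]\, z^{-k}, \]
and find its roots, which will be at (or hopefully near) $z = e^{\pm j\omega_1}$.

3.1. Pseudo-spectra. From Prop. 3.1, we can create an artificial spectrum by evaluating the annihilating filter at different frequencies. So we construct $\mathbf{e}(\omega_0)$ for different frequencies $\omega_0$ and evaluate:
\[ \hat{P}(e^{j\omega_0}) = \frac{1}{|\mathbf{e}(\omega_0)^H \mathbf{u}_{\min}|^2}. \tag{3.3} \]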
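The Pisarenko steps of this section can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than a reference implementation: it uses the exact model autocorrelation for two complex tones (so $K = 2$, $M = 3$) instead of an estimate from data, and the function names `pisarenko` and `pseudospectrum` are hypothetical.

```python
import numpy as np

def steering(omega, M):
    """e(omega) = [1, e^{j omega}, ..., e^{j omega (M-1)}]^T."""
    return np.exp(1j * omega * np.arange(M))

def pisarenko(R, K):
    """Frequencies, powers |A_k|^2 and noise variance from a (K+1) x (K+1) R_x."""
    M = K + 1
    lam, U = np.linalg.eigh(R)          # ascending: lam[0] is the noise eigenvalue
    sigma2, u_min = lam[0], U[:, 0]
    # Zeros of U_min(z) = sum_k u_min[k] z^{-k}; multiplying by z^K gives a
    # polynomial whose coefficients, highest power first, are u_min itself.
    omegas = np.sort(np.angle(np.roots(u_min)))
    # Prop. 3.2: sum_k |A_k|^2 |e_k^H u_i|^2 = lam_i - sigma_w^2 for signal u_i.
    E = np.column_stack([steering(w, M) for w in omegas])
    G = (np.abs(E.conj().T @ U[:, 1:]) ** 2).T      # G[i, k] = |e_k^H u_i|^2
    powers = np.linalg.solve(G, lam[1:] - sigma2)
    return omegas, powers, sigma2, u_min

def pseudospectrum(u_min, grid):
    """Eqn. 3.3 evaluated on a grid of frequencies."""
    M = len(u_min)
    return np.array([1.0 / abs(np.vdot(steering(w, M), u_min)) ** 2 for w in grid])

# Exact model R_x for two tones with |A_1|^2 = 1, |A_2|^2 = 0.25, sigma_w^2 = 0.1.
E_true = np.column_stack([steering(w, 3) for w in (0.5, 1.2)])
R = E_true @ np.diag([1.0, 0.25]) @ E_true.conj().T + 0.1 * np.eye(3)
omegas, powers, sigma2, u_min = pisarenko(R, K=2)
```

Note that the linear system for the powers becomes singular when all tones have equal power and the two columns of `G` coincide, which is one symptom of the sensitivity mentioned above.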
If we let $\omega_0$ take different values we can plot what is called a pseudo-spectrum, which has peaks where the zeros are. This is a good way to see what the spectrum might look like, instead of using a nonparametric approach such as the periodogram.

4. MUSIC: MUltiple SIgnal Classification

This was developed by Schmidt [8], originally around 1979. The main idea is really just to use averaging to improve the performance of Pisarenko's estimator. Now we pick $M > K + 1$. Let
\[ \underbrace{\lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_K}_{K \text{ signal eigenvalues}} \geq \underbrace{\lambda_{K+1} \geq \ldots \geq \lambda_M}_{M-K \text{ noise eigenvalues}} \]
be the eigenvalues of $\mathbf{R}_x$, along with their associated eigenvectors. Now instead of just one annihilating filter, we construct $M - K$ such noise eigenfilters:
\[ U_i(z) = \sum_{m=0}^{M-1} u_i[m]\, z^{-m}, \qquad i = K+1, \ldots, M. \tag{4.1} \]
Observe that every one of these filters has $M - 1$ roots by the fundamental theorem of algebra, $K$ of which are common to all the filters and lie at the angles corresponding to the frequencies of the signal under consideration. The other $M - K - 1$ zeros are spurious, and some of them can be close to the unit circle.

Proposition 4.1. This is more appropriately a "trick" than a proposition. All the $\mathbf{u}_i$'s are orthogonal to the signal subspace, so we find the $K$ common zeros by averaging! After averaging, we simply do the same computation as we did for Pisarenko.

4.1. Pseudo-spectra. Similarly we can also generate the pseudo-spectrum based on MUSIC. We construct $\mathbf{e}(\omega_0)$ for different frequencies $\omega_0$ and evaluate:
\[ \hat{P}(e^{j\omega_0}) = \frac{1}{\sum_{k=K+1}^{M} |\mathbf{e}(\omega_0)^H \mathbf{u}_k|^2}, \]
or, using our projection idea in Eqn. 2.9, we get a nice compact formula:
\[ \hat{P}(e^{j\omega_0}) = \frac{1}{\mathbf{e}(\omega_0)^H \mathbf{P}_n\, \mathbf{e}(\omega_0)}. \tag{4.2} \]
If we let $\omega_0$ take different values we can plot what is called a pseudo-spectrum, which has peaks where the zeros are. This is a good way to see what the spectrum might look like, instead of using a nonparametric approach such as the periodogram.
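The MUSIC pseudo-spectrum of Eqn. 4.2 can be evaluated on a frequency grid as sketched below. The sketch assumes an exact model autocorrelation with three tones and $M = 8$; with measured data one would substitute an estimate of $\mathbf{R}_x$. The simple peak picker `pick_peaks` is an illustrative helper, not part of the algorithm as described in the text.

```python
import numpy as np

def music_spectrum(R, K, grid):
    """Eqn. 4.2: 1 / (e^H P_n e), using e^H P_n e = ||U_n^H e||^2."""
    M = R.shape[0]
    lam, U = np.linalg.eigh(R)                      # ascending eigenvalues
    Un = U[:, :M - K]                               # M-K noise eigenvectors
    A = np.exp(1j * np.outer(np.arange(M), grid))   # columns are e(omega_0)
    return 1.0 / np.sum(np.abs(Un.conj().T @ A) ** 2, axis=0)

def pick_peaks(P, grid, K):
    """Return the grid locations of the K largest local maxima of P."""
    idx = [i for i in range(1, len(P) - 1) if P[i] > P[i - 1] and P[i] >= P[i + 1]]
    idx = sorted(idx, key=lambda i: P[i], reverse=True)[:K]
    return np.sort(grid[idx])

M, K = 8, 3
true_omegas = np.array([0.6, 1.0, 2.0])
E = np.exp(1j * np.outer(np.arange(M), true_omegas))
R = E @ E.conj().T + 0.1 * np.eye(M)    # unit-power tones, sigma_w^2 = 0.1

grid = np.linspace(0.0, np.pi, 4001)
P = music_spectrum(R, K, grid)
est_omegas = pick_peaks(P, grid, K)
```

The resolution of the estimate is limited here by the grid spacing; finding the roots of the eigenfilters of Eqn. 4.1 instead of scanning a grid avoids that limitation.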
We give an example¹ of the pseudo-spectra when there are 3 exponential components in the desired signal in Fig. 1. Note that they all have common zeros at the frequencies of the desired signal.

[Figure 1. Magnitude of eigenfilters. Panel title: "FFT of Eigenfilters"; vertical axis: Magnitude (0 to −60); horizontal axis: Angle (Degrees), 0 to 180.]

5. ESPRIT: Estimation of Signal Parameters via Rotational Invariance Techniques

There's a long story about this one. It's not clear whether Paulraj or Roy [7] was the first inventor of ESPRIT. We start our development by considering a cute "trick" which is the basic idea behind ESPRIT.

Proposition 5.1. Consider the case when $K = 1$, and no noise. Consider a vector of observations of $x[n] = A_1 \exp(j\omega_1 n)$, called $\mathbf{x}$:
\[ \mathbf{x} = [x_0, x_1, \ldots, x_{N-1}] = A_1 \times [1, e^{j\omega_1}, e^{j2\omega_1}, \ldots, e^{j\omega_1(N-1)}]. \]
Now we partition this vector in two overlapping ways:
\[ \mathbf{s}_1 = [x_0, x_1, \ldots, x_{N-2}], \qquad \mathbf{s}_2 = [x_1, x_2, \ldots, x_{N-1}], \]
and note that:
\[ \mathbf{s}_2 = e^{j\omega_1} \mathbf{s}_1. \]
So in the noisy case, we can use some sort of averaging to get a better estimate of what $\omega_1$ is. How can we extend this to the multiple signal case?

¹ The Matlab code used was developed based on previous work by Prof. Mike Zoltowski at Purdue University.
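The single-tone trick of Prop. 5.1 can be sketched as follows. The "averaging" used here, taking the angle of the inner product $\mathbf{s}_1^H \mathbf{s}_2$, is one possible choice and an assumption of this sketch; the function name `single_tone_freq` and the test signal are likewise illustrative.

```python
import numpy as np

def single_tone_freq(x):
    """Estimate omega_1 from s_2 = e^{j omega_1} s_1 via the angle of s_1^H s_2."""
    x = np.asarray(x, dtype=complex)
    s1, s2 = x[:-1], x[1:]              # first and last N-1 samples
    return np.angle(np.vdot(s1, s2))    # vdot conjugates its first argument

n = np.arange(200)
clean = 0.8 * np.exp(1j * 0.3) * np.exp(1j * 0.9 * n)

rng = np.random.default_rng(0)
noise = 0.1 * (rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size))

omega_clean = single_tone_freq(clean)           # exact in the noise-free case
omega_noisy = single_tone_freq(clean + noise)   # close to 0.9 at this SNR
```

Summing over all $N - 1$ sample pairs before taking the angle is what averages out the noise; the amplitude and phase of $A_1$ cancel in the product.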

To extend it to the multiple signal case, we start again with the autocorrelation matrix:
\[ \mathbf{R}_x = \mathbf{U} \mathbf{\Lambda}_x \mathbf{U}^H. \tag{5.1} \]
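Now let us define two matrices:
\[ \mathbf{\Gamma}_1 = \big[\, \mathbf{I}_{M-1} \;|\; \mathbf{0}_{(M-1)\times 1} \big]_{(M-1)\times M}, \qquad \mathbf{\Gamma}_2 = \big[\, \mathbf{0}_{(M-1)\times 1} \;|\; \mathbf{I}_{M-1} \big]_{(M-1)\times M}, \tag{5.2} \]
which, applied from the left, select the first and last $(M-1)$ rows of a matrix with $M$ rows, respectively. We call these the selector matrices. We use them as follows:
\[ \mathbf{S}_1 = \mathbf{\Gamma}_1 \mathbf{U}, \qquad \mathbf{S}_2 = \mathbf{\Gamma}_2 \mathbf{U}. \tag{5.3} \]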
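Now we use a very important property.

Theorem 5.2 (Rotational invariance). For the selector matrices defined in Eqn. 5.2, for every $k$ denoting the different frequency components, we have:
\[ [\mathbf{\Gamma}_1 \mathbf{e}(\omega_k)]\, e^{j\omega_k} = \mathbf{\Gamma}_2 \mathbf{e}(\omega_k), \]
which, with $\mathbf{E} = [\mathbf{e}(\omega_1) \; \ldots \; \mathbf{e}(\omega_K)]$ as in Eqn. 2.8, we can compactly write as:
\[ [\mathbf{\Gamma}_1 \mathbf{E}]\, \mathbf{\Phi} = \mathbf{\Gamma}_2 \mathbf{E}, \]
where $\mathbf{\Phi} = \mathrm{diag}(e^{j\omega_1}, e^{j\omega_2}, \ldots, e^{j\omega_K})$.

From here we can see the light at the end of the tunnel. Recall Rem. 2.1, where we argued that so far we have only identified the signal up to its subspace: the signal eigenvectors $\mathbf{U}_s$ span the same subspace as the columns of $\mathbf{E}$, so $\mathbf{E} = \mathbf{U}_s \mathbf{T}$ for some invertible $K \times K$ matrix $\mathbf{T}$. Let's do a little more work to tie Thm. 5.2 to the favorite engineering tool in linear algebra, namely the eigenvalue decomposition:
\begin{align*}
\mathbf{\Gamma}_1 (\mathbf{U}_s \mathbf{T})\, \mathbf{\Phi} &= \mathbf{\Gamma}_2 \mathbf{U}_s \mathbf{T} \\
\mathbf{\Gamma}_1 \mathbf{U}_s\, \underbrace{(\mathbf{T} \mathbf{\Phi} \mathbf{T}^{-1})}_{\text{eig}} &= \mathbf{\Gamma}_2 \mathbf{U}_s, \tag{5.4}
\end{align*}
where in the first line we substitute $\mathbf{E} = \mathbf{U}_s \mathbf{T}$ into Thm. 5.2, and in the second line we multiply from the right by $\mathbf{T}^{-1}$. The matrix $\mathbf{T}\mathbf{\Phi}\mathbf{T}^{-1}$ is in general not diagonal, but it has the same eigenvalues as $\mathbf{\Phi}$, namely $e^{j\omega_k}$.

Now we're ready to unconfuse ourselves by summarizing the steps in the algorithm:
(1) Estimate $r_x[k]$ and form $\mathbf{R}_x$ of size $M \times M$, with $M \geq K + 1$.
(2) Calculate the eigendecomposition $\mathbf{R}_x = \mathbf{U} \mathbf{\Lambda}_x \mathbf{U}^H$. Pick the eigenvectors corresponding to the $K$ largest eigenvalues and form the matrix $\mathbf{U}_s = [\mathbf{u}_1 | \ldots | \mathbf{u}_K]$. The signal subspace is spanned by the columns of this matrix.
(3) Solve the equation $(\mathbf{\Gamma}_1 \mathbf{U}_s)\mathbf{\Phi} = \mathbf{\Gamma}_2 \mathbf{U}_s$ for the $K \times K$ matrix $\mathbf{\Phi}$ (which here plays the role of $\mathbf{T}\mathbf{\Phi}\mathbf{T}^{-1}$ in Eqn. 5.4).
(4) Compute the $K$ eigenvalues of $\mathbf{\Phi}$. In the ideal case the eigenvalues are $\{e^{j\omega_k}\}$, $k = 1, \ldots, K$.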
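The four steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: the autocorrelation matrix is estimated by averaging sliding-window outer products, step (3) is solved with `np.linalg.lstsq`, and the signal, noise level, window length $M = 10$ and function names are arbitrary choices made for the example.

```python
import numpy as np

def autocorr_matrix(x, M):
    """Step (1): sample estimate of the M x M autocorrelation matrix."""
    x = np.asarray(x, dtype=complex)
    X = np.array([x[n:n + M] for n in range(len(x) - M + 1)])   # rows: windows
    return X.T @ X.conj() / X.shape[0]

def esprit(R, K):
    """Steps (2)-(4): frequencies from the rotational invariance of U_s."""
    lam, U = np.linalg.eigh(R)              # ascending eigenvalues
    Us = U[:, -K:]                          # eigenvectors of the K largest
    S1, S2 = Us[:-1, :], Us[1:, :]          # Gamma_1 U_s and Gamma_2 U_s
    Phi, *_ = np.linalg.lstsq(S1, S2, rcond=None)   # solve S1 Phi = S2
    return np.sort(np.angle(np.linalg.eigvals(Phi)))

rng = np.random.default_rng(0)
n = np.arange(400)
true_omegas = np.array([0.4, 1.3, 2.2])
x = sum(np.exp(1j * w * n) for w in true_omegas)
x = x + 0.1 * (rng.standard_normal(n.size) + 1j * rng.standard_normal(n.size))

est_omegas = esprit(autocorr_matrix(x, M=10), K=3)
```

Unlike the pseudo-spectrum approaches, no frequency grid is scanned: the estimates come directly from the eigenvalues of a $K \times K$ matrix.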
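
5.1. Least squares solution. In solving the rotational invariance formula of Thm. 5.2, we can also use the least-squares solution. The original problem is:
\[ (\mathbf{\Gamma}_1 \mathbf{U}_s)\, \mathbf{\Phi} = \mathbf{\Gamma}_2 \mathbf{U}_s, \]
whose least-squares solution is given by:
\[ \mathbf{\Phi} = \big(\mathbf{U}_s^H \mathbf{\Gamma}_1^H \mathbf{\Gamma}_1 \mathbf{U}_s\big)^{-1} \mathbf{U}_s^H \mathbf{\Gamma}_1^H \mathbf{\Gamma}_2 \mathbf{U}_s. \tag{5.5} \]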
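As a small check of Eqn. 5.5, the sketch below computes the normal-equation solution and compares it with NumPy's generic least-squares solver. The matrices `S1` and `S2` stand in for $\mathbf{\Gamma}_1 \mathbf{U}_s$ and $\mathbf{\Gamma}_2 \mathbf{U}_s$ and are randomly generated here purely for illustration.

```python
import numpy as np

def ls_rotation(S1, S2):
    """Eqn. 5.5: Phi = (S1^H S1)^{-1} S1^H S2, via the normal equations."""
    S1H = S1.conj().T
    return np.linalg.solve(S1H @ S1, S1H @ S2)

rng = np.random.default_rng(1)
M, K = 8, 3
# Stand-in for Gamma_1 U_s: a random (M-1) x K complex matrix.
S1 = rng.standard_normal((M - 1, K)) + 1j * rng.standard_normal((M - 1, K))
Phi_true = np.diag(np.exp(1j * np.array([0.4, 1.3, 2.2])))

S2 = S1 @ Phi_true                      # consistent (noise-free) system
Phi_ls = ls_rotation(S1, S2)

# Perturbed system: no exact solution, so the least-squares fit is used.
S2_noisy = S2 + 0.01 * (rng.standard_normal(S2.shape) + 1j * rng.standard_normal(S2.shape))
Phi_noisy = ls_rotation(S1, S2_noisy)
Phi_ref, *_ = np.linalg.lstsq(S1, S2_noisy, rcond=None)
```

Forming $\mathbf{S}_1^H \mathbf{S}_1$ explicitly squares the condition number, so for ill-conditioned problems a QR- or SVD-based solver such as `np.linalg.lstsq` is usually the safer choice.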
And we can apply what we know about least-squares solutions to analytically derive the mean-square error of this estimate.

6. Application: smart antennas, interference rejection

The frequency estimation algorithms which we have developed can be used in the context of smart antennas, and are especially applicable to narrowband wireless communication systems.

[Figure 2. Uniform antenna array: antennas at positions 0, d, 2d, ..., (M − 1)d along the +x axis, with wavefronts incident at angle θ.]

Consider Fig. 2. Suppose we have $M$ equi-spaced antennas in a line configuration, at distance $d$ from each other, and an incoming waveform at angle $\theta$. We take a spatial snapshot of the incoming signal by taking $M$ simultaneous samples at the $M$ antennas in the array. The dashed lines illustrate the incident wavefront, in which the angles are equal. It can then be shown that the spatial samples will be a function of the center frequency, the speed of light and the incident angle $\theta$. We can write the samples at antenna $m$ at time $n$ when there are $K$ incoming signals as:
\[ x[m; n] = \sum_{k=1}^{K} s_k[n]\, e^{j\omega_k m} + w[m; n], \qquad m = 0, \ldots, M-1, \tag{6.1} \]
where $\omega_k = \frac{2\pi}{\lambda} d \cos(\theta_k)$, $\lambda$ is the carrier wavelength, and $s_k[n]$ is the message of the $k$-th signal. Note that by using Eqn. 6.1 we have related the problem of finding the direction of arrival (DoA) of the different signals to the frequency estimation problem. This is especially useful for beamforming and interference rejection, where we would like to be able to separate out interfering users by their DoAs.

When the signals of interest are narrowband, the message component $s_k[n]$ changes very slowly with time, and thus with spatial translation. Therefore we can consider the message component to be a constant:
\[ x[m] = \sum_{k=1}^{K} s_k\, e^{j\omega_k m} + w[m], \qquad m = 0, \ldots, M-1, \tag{6.2} \]
which is a very familiar problem statement for frequency estimation.

Remark 6.1. Note that from the expression $\omega_k = \frac{2\pi}{\lambda} d \cos(\theta_k)$ we can see that we cannot disambiguate signals coming in from the "front" of the array and from the "back" of the array. Thus in real-world implementations, other configurations are also considered (such as circular [5] or triangular [4]).

Now we are ready to go back to our task of separating the different users. WLOG consider the $k = 1$ user to be the "good" user, and all other signals to be the interfering users. What we would like to do is reject the signals of the bad users while passing the signal of the good user.

6.1. Spatial filter. Before we continue further, consider a solution in the form of a linear combination of the input signals from the different receive antennas:
\[ y[n] = \sum_{m=0}^{M-1} h^*[m; n]\, x[m; n], \tag{6.3} \]
or in vector form:
\[ y[n] = \mathbf{h}_M^H \mathbf{x}_n. \tag{6.4} \]
We can recognize this as the ever-familiar FIR filter, except that now it operates in space and time. Therefore this is called a space-time FIR filter. Consider the effect on the samples of the signals:
\[ y[n] = \mathbf{h}_M^H \mathbf{x}_n = \sum_{k=1}^{K} s_k[n] \big( \mathbf{h}_M^H \mathbf{e}(\omega_k) \big) + \mathbf{h}_M^H \mathbf{v}[n], \tag{6.5} \]
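where $\mathbf{e}(\omega_k)$ is defined as usual and contains the complex exponential components, and $\mathbf{v}[n]$ collects the noise samples $w[m; n]$.

6.2. Minimum-variance solution. We first write our problem as a constrained optimization problem:
\[ \min_{\mathbf{h}_M} E\big[ |y[n]|^2 \big] \quad \text{subject to} \quad \mathbf{h}_M^H \mathbf{e}(\omega_1) = 1, \tag{6.6} \]
where the first user is the desired user. Since $E[|y[n]|^2] = \mathbf{h}_M^H \mathbf{R}_x \mathbf{h}_M$, we can rewrite the optimization problem as follows:
\[ \min_{\mathbf{h}_M} \mathbf{h}_M^H \mathbf{R}_x \mathbf{h}_M \quad \text{subject to} \quad \mathbf{h}_M^H \mathbf{e}(\omega_1) = 1. \tag{6.7} \]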
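From here, using the Lagrange multiplier method, we obtain the solution:
\[ \mathbf{h}_M^{\text{opt}} = \frac{\mathbf{R}_x^{-1} \mathbf{e}(\omega_1)}{\mathbf{e}(\omega_1)^H \mathbf{R}_x^{-1} \mathbf{e}(\omega_1)}. \tag{6.8} \]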
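As a sketch of the Lagrange step, the constrained problem of Eqn. 6.7 can be worked as follows. The multiplier $\mu$ and the shorthands $\mathbf{h} = \mathbf{h}_M$ and $\mathbf{e} = \mathbf{e}(\omega_1)$ are notation introduced here for the derivation, and $\mathbf{R}_x$ is assumed to be Hermitian and invertible:

\[
\begin{aligned}
\mathcal{L}(\mathbf{h}, \mu) &= \mathbf{h}^H \mathbf{R}_x \mathbf{h} + \mu\,(1 - \mathbf{h}^H \mathbf{e}) + \mu^*(1 - \mathbf{e}^H \mathbf{h}), \\
\frac{\partial \mathcal{L}}{\partial \mathbf{h}^*} = \mathbf{R}_x \mathbf{h} - \mu\, \mathbf{e} = \mathbf{0} \;&\Rightarrow\; \mathbf{h} = \mu\, \mathbf{R}_x^{-1} \mathbf{e}, \\
\mathbf{h}^H \mathbf{e} = 1 \;&\Rightarrow\; \mu^*\, \mathbf{e}^H \mathbf{R}_x^{-1} \mathbf{e} = 1 \;\Rightarrow\; \mu = \frac{1}{\mathbf{e}^H \mathbf{R}_x^{-1} \mathbf{e}}, \\
\mathbf{h}^{\text{opt}} = \frac{\mathbf{R}_x^{-1} \mathbf{e}}{\mathbf{e}^H \mathbf{R}_x^{-1} \mathbf{e}}, \qquad & (\mathbf{h}^{\text{opt}})^H \mathbf{R}_x \mathbf{h}^{\text{opt}} = \frac{1}{\mathbf{e}^H \mathbf{R}_x^{-1} \mathbf{e}}.
\end{aligned}
\]

The multiplier is real because $\mathbf{e}^H \mathbf{R}_x^{-1} \mathbf{e}$ is real and positive for a positive-definite $\mathbf{R}_x$. The last expression is the minimum output power attained when the filter is steered to $\omega_1$.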
We can also show the pseudo-spectrum given by the minimum-variance beamforming algorithm:
\[ \hat{S}(e^{j\omega_0}) = \frac{1}{\mathbf{e}(\omega_0)^H \mathbf{R}_x^{-1} \mathbf{e}(\omega_0)}. \tag{6.9} \]
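The minimum-variance filter and its pseudo-spectrum can be sketched as follows, assuming an exact model $\mathbf{R}_x$ with one desired user and one stronger interferer. The scenario, the powers and the function names `mvdr_weights` and `mv_spectrum` are illustrative assumptions.

```python
import numpy as np

def steering(omega, M):
    """e(omega) = [1, e^{j omega}, ..., e^{j omega (M-1)}]^T."""
    return np.exp(1j * omega * np.arange(M))

def mvdr_weights(R, omega):
    """Minimum-variance filter R^{-1} e / (e^H R^{-1} e) steered to omega."""
    e = steering(omega, R.shape[0])
    Rinv_e = np.linalg.solve(R, e)          # avoids forming R^{-1} explicitly
    return Rinv_e / np.vdot(e, Rinv_e)

def mv_spectrum(R, omega):
    """Eqn. 6.9: 1 / (e^H R^{-1} e), real for Hermitian positive-definite R."""
    e = steering(omega, R.shape[0])
    return 1.0 / np.real(np.vdot(e, np.linalg.solve(R, e)))

M = 8
e1, e2 = steering(0.5, M), steering(1.5, M)     # desired user, interferer
R = (1.0 * np.outer(e1, e1.conj())              # desired user, power 1
     + 10.0 * np.outer(e2, e2.conj())           # interferer, power 10
     + 0.01 * np.eye(M))                        # white noise

h = mvdr_weights(R, 0.5)
gain_desired = np.vdot(h, e1)       # h^H e(omega_1), constrained to equal 1
gain_interf = abs(np.vdot(h, e2))   # response towards the interferer
```

In this example the filter passes the desired user with unit gain while placing a deep null on the interferer, without the interferer's frequency ever being supplied explicitly.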

6.3. MUSIC and ESPRIT. We can use our previously developed frequency estimators to find the angles of arrival, and then design filters as we like. We give an example of the performance in Fig. 3.

[Figure 3. ESPRIT estimate of DoA. Polar plot (angles 0 to 330 degrees, radius up to 1); legend: True "poles", ESPRIT eigenvalues.]
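A DoA estimate of this kind can be sketched as below. This is not the Matlab code behind Fig. 3; it is a minimal NumPy illustration that assumes half-wavelength spacing ($d/\lambda = 0.5$), two sources with random complex messages, a spatial covariance averaged over $N$ snapshots, and the hypothetical function name `esprit_doa`.

```python
import numpy as np

def esprit_doa(X, K, d_over_lambda=0.5):
    """ESPRIT on an M x N snapshot matrix X (rows: antennas); angles in degrees."""
    M, N = X.shape
    R = X @ X.conj().T / N                      # sample spatial covariance
    lam, U = np.linalg.eigh(R)                  # ascending eigenvalues
    Us = U[:, -K:]                              # signal subspace
    Phi, *_ = np.linalg.lstsq(Us[:-1], Us[1:], rcond=None)
    omegas = np.angle(np.linalg.eigvals(Phi))   # spatial frequencies omega_k
    # Invert omega_k = 2 pi (d / lambda) cos(theta_k).
    cos_theta = np.clip(omegas / (2 * np.pi * d_over_lambda), -1.0, 1.0)
    return np.sort(np.degrees(np.arccos(cos_theta)))

rng = np.random.default_rng(0)
M, N, K = 8, 500, 2
true_deg = np.array([60.0, 100.0])
omegas = 2 * np.pi * 0.5 * np.cos(np.radians(true_deg))
A = np.exp(1j * np.outer(np.arange(M), omegas))     # M x K array responses

S = (rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))) / np.sqrt(2)
W = 0.1 * (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
X = A @ S + W                                       # Eqn. 6.1 for all m and n

est_deg = esprit_doa(X, K)
```

The estimated angles can then be passed to a spatial filter design such as the minimum-variance solution of Sec. 6.2.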

7. Comments

As with many other fields, there is a high degree of factionalism in frequency estimation. European researchers tend to be biased towards MUSIC, and Americans towards ESPRIT. For evidence, see any frequency estimation textbook! A good paper which compares both in the context of direction estimation is by Kangas, Stoica and Söderström [3]. It is shown there that in the intermediate range of modeling error, ESPRIT outperforms MUSIC; in general, however, MUSIC is more accurate than ESPRIT.

References

1. P. Billingsley, Probability and measure, second edition, John Wiley and Sons, New York, 1986.
2. M. Hayes, Digital signal processing and modeling, Wiley, New York, NY, 1996.
3. A. Kangas, P. Stoica, and T. Söderström, Finite sample and modeling error effects on ESPRIT and MUSIC direction estimators, IEE Proc. Radar, Sonar Navigation 141 (1994), 249–255.
4. J. Liberti and T. Rappaport, Smart antennas for wireless communications: IS-95 and third generation CDMA applications, Prentice-Hall, Englewood Cliffs, NJ, 1999.
5. C. P. Matthews and M. D. Zoltowski, Eigenstructure techniques for 2-D angle estimation with uniform circular arrays, IEEE Transactions on Signal Processing 42 (1994), no. 9, 2395–2407.
6. V. F. Pisarenko, The retrieval of harmonics from a covariance function, Geophys. J. Roy. Astron. Soc. 33 (1973), 347–366.
7. R. Roy and T. Kailath, ESPRIT - Estimation of Signal Parameters via Rotational Invariance Techniques, IEEE Transactions on Acoustics, Speech, and Signal Processing ASSP-37 (1989), 984–995.
8. R. O. Schmidt, A signal subspace approach to multiple emitter location and spectral estimation, Ph.D. thesis, Stanford University, Stanford, CA, 1981.

MIT Lab. for Information and Decision Systems
E-mail address:
[email protected]