On Bayesian Estimation of Spectral Components

3 downloads 0 Views 667KB Size Report
Abstract. Short-time spectral attenuation is a common form of audio signal enhancement in ... add method of short-time Fourier analysis and synthesis. Taking ...
On Bayesian Estimation of Spectral Components for Broadband Noise Reduction in Audio Signals CUED/F-INFENG/TR.404

Patrick J. Wolfe and Simon J. Godsill Signal Processing Group, University of Cambridge Department of Engineering, Trumpington Street CB2 1PZ, Cambridge, UK pjw47eng. am.a .uk, sjgeng. am.a .uk http://www-sigpro .eng. am.a .uk August 2001 Abstract Short-time spectral attenuation is a common form of audio signal enhancement in which a time-varying filter, or suppression rule, is applied to the frequency-domain transform of a corrupted signal. Here we review suppression rules derived under a Gaussian statistical model and interpret them in a Bayesian framework. In particular, the Ephraim and Malah spectral amplitude estimator is both optimal in a minimum mean-square error sense and well-known for its associated colourless residual noise. However, its implementation requires the evaluation of computationally expensive exponential and Bessel functions. Here we show that under the same modelling assumptions, alternative methods of Bayesian estimation lead to suppression rules exhibiting almost identical behaviour. We derive three such rules and show that, in addition to permitting an efficient implementation, they also yield a more intuitive interpretation.

1 Introduction Herein we address broadband noise reduction for audio signals via statistical modelling of their spectral components. We limit our investigation to short-time spectral attenuation, a popular method of broadband noise reduction in which a time-varying filter, or suppression rule, is applied to the frequency-domain transform of a corrupted signal. We first review existing suppression rules derived under a Gaussian statistical model and interpret them in a Bayesian framework. We then employ the same model and framework to derive three new suppression rules exhibiting near-optimal behaviour. This report is organised as follows: in the remainder of Section 1 we introduce the assumed statistical model and estimation framework, and then employ these in an alternate derivation of the minimum mean-square error (MMSE) suppression rules due to

1

1 INTRODUCTION

CUED/F-INFENG/TR.404

Wiener [1] and Ephraim and Malah [2]. In Section 2 we derive three approximations to the MMSE spectral amplitude estimator of [2], and in Section 3 we investigate and compare the behaviour of these suppression rules. Throughout the ensuing discussion we consider, for simplicity of notation and without loss of generality, the case of a single, windowed, short-time block of audio data. To facilitate a comparison our notation follows that of [2], except that complex quantities appear in bold.

1.1

The Additive Observation Model

The most popular methods of broadband audio noise reduction to date consist of applying a time-varying filter to the frequency-domain transform of a noisy signal. Let xn = x(nT ) in general represent values from a finite-duration analogue signal sampled at a regular interval T , so that a signal in noise may be represented by the additive observation model

yn = x n + d n ; where yn is the observed signal at time n, xn is the original signal, and dn is additive random noise, uncorrelated with the original signal xn . In many implementations the set of observations fyn g is analysed using the overlapadd method of short-time Fourier analysis and synthesis. Taking the short-time Fourier transform (STFT) on length-N intervals yields K frequency bins per interval: Yk

=

Xk + Dk ;

(1)

where these quantities are denoted in bold to indicate that they are complex. Noise reduction in this manner may be viewed as the application of a suppression rule, or nonnegative real-valued gain Hk , to each bin k of the observed signal spectrum Yk , in order to form b an estimate X k of the original signal spectrum: b X k

=

Hk  Y k :

Within this STFT-based framework a simple Gaussian model often proves effective [3]. In this case the elements of fXk g and fDk g are modelled as independent, identically distributed Gaussian random variables with variances 2x and 2d , respectively: 





Xk  N2 0; 2x I2 ;

1.2



Dk  N2 0; 2d I2 :

(2)

Bayesian Estimation of Random Variables

Here we interpret suppression rules based on the model of (2) in a Bayesian framework. Thus, we wish to estimate the value of a random variable X (in our case, the underlying spectral amplitude) as a function of the observations of some related random variable Y

2

1 INTRODUCTION

CUED/F-INFENG/TR.404

(the observed spectral amplitude). Thus we may define a nonnegative cost function C(x; x ^) of x and its estimate x ^, and then minimise the risk R, defined as the average or expected cost. The Bayes estimate minimises the risk with respect to fX;Y (x; y), the joint probability density function of X and Y : R

, ZE[C(Zx; x^)℄

1 1 C(x; x^) fX;Y(x; y) dx dy 1 1 Z1 Z1 = fY (y) C(x; x^) fX Y (xjy) dx dy: 1 1

=

-

-

j

-

(3)

-

^) is by definition nonnegative it is sufficient to minimise the inner integral of (3) As C(x; x to obtain the optimal estimate x ^opt :

x^opt = argmin x^

1.3

Z1

1

-

C(x; x^) fX Y (xjy) dx: j

The Wiener Suppression Rule

A standard method of signal estimation consists of minimising the mean-square error of b of a related random variable X; interpreted in a Bayesian framework this some estimate X MMSE criterion may be viewed as a quadratic cost function. Considering (1), such a cost function is given by

C(xk ; xb k ) = jxb k - xk j2 : opt we thus seek b To determine the optimal spectral estimate x k

argmin E[C(xk ; xk ) jyk ℄ = argmin

Z

b

b xk

b xk

C(xk ; xb k ) p(xk jyk ) dxk :

xk

Bayes’ Rule gives

p(xk jyk ) / p(yk jxk ) p(xk ) and the prior distributions defined in (2) yield the following:

p(yk jxk ) = p(xk ) =

1 exp 2d 1 exp 2x

3

jyk - xk j2 2d ! jxk j2 : 2x

!

1 INTRODUCTION

CUED/F-INFENG/TR.404

Therefore, it is sufficient to minimise

E[C(xk ; xb k ) jyk ℄ =

Z

xk

jy - x j2 jxb k - xk j exp - k 2 k d 2

jxk j2 2x

!

dxk :

(4)

The optimal solution in an MMSE sense is given by the mean of the posterior density appearing in (4), a well-known result that may be stated as the following theorem: Theorem 1 The conditional mean E[XjY ℄ of a random variable X, given a random variable Y , is the MMSE estimate of X given Y . Proof [4, p. 217–18]. mean-square error is

Suppose f(Y ) is some estimate of X given Y . The resultant

E[(X - f(Y ))2 ℄ = E[(X - E[XjY ℄ + E[XjY ℄ - f(Y ))2 ℄ =

E[(X - E[XjY ℄)2 ℄ + E[(E[XjY ℄ - f(Y ))2 ℄

as the cross terms cancel owing to properties of the expectation operator, and therefore

E[(X - f(Y ))2 ℄  E[(X - E[XjY ℄)2 ℄: The mean of the posterior density appearing in (4) follows directly from its Gaussian form:

2 E[Xk jYk ℄ = 2 x 2 Yk : x + d

(5)

The result given by (5) is recognisable as the well-known Wiener filter [1]. We have shown its optimality under the assumed model in a global sense; however, as detailed in Appendix A, one may also obtain a direct proof within the quadratic cost function framework.

1.4

The Ephraim and Malah Suppression Rule

The relative importance of spectral amplitude rather than phase in auditory perception [5, 6] has led researchers to re-cast the spectral estimation problem in terms of the former quantity. In this vein McAulay and Malpass [7] derive a maximum-likelihood (ML) spectral amplitude estimator under the assumption of Gaussian noise and an original signal characterised by a deterministic waveform of unknown amplitude and phase: v u

Hk =

1 1u 2x : + t 2 2 2x + 2d

4

(6)

1 INTRODUCTION

CUED/F-INFENG/TR.404

As an extension of the model underlying (6), Ephraim and Malah [2] derive a minimum mean-square error (MMSE) short-time spectral amplitude estimator based on (2); i.e., under the assumption that the Fourier expansion coefficients of the original signal and the noise may be modelled as statistically independent, zero-mean, Gaussian random variables. Thus the observed spectral component in bin k, Yk , Rk exp(j#k ), is equal to the sum of the spectral components of the signal, Xk , Ak exp(j k ), and the noise, Dk . This model leads to the following marginal, joint, and conditional distributions:

8 < 2akk x p(ak) = :0 Æ1 (

2

p( k ) =

p(ak; k ) = p(Yk jak ; k ) =

0



a2k exp ) x (k)

if ak

2

[0;

1) ;

otherwise: if k 2 [-; ) ; otherwise:

ak exp x (k) 1



d (k)

exp

a2k x (k)

0

B -

(7)

(8)

!

(9) 1

2



Yk - ak ej k

d (k)

C A

(10)

where it is understood that (9) and (10) are defined over the range of ak in (7) and k in (8); x (k) , E[jXk j2 ℄ and d (k) , E[jDk j2 ℄ denote the respective variances of the kth short-time spectral component of the signal and noise. Additionally, define

1 (k)

,  1(k) +  1(k) x

d

and

k ,

k

; 1 + k k

 (k) k , x d (k)

k ,

R2k ; d (k)

where k and k are interpreted after [7] as the a priori and a posteriori signal-to-noise ratios (SNR), respectively. Under the assumed model, the posterior density p(ak jYk ) (following integration w.r.t. the phase term k ) is Rician [8] with parameters (2k ; s2k ):

p(akjYk ) =

ak exp 2k

2k ,

(k) ; 2

!

!

as a2k + s2k I0 k 2 k ; 2 2k k s2k

5

, k(k);

(11)

(12)

2 THREE ALTERNATIVE SUPPRESSION RULES

CUED/F-INFENG/TR.404

where Ii () denotes the modified Bessel function of order i. The mth moment of a Rician distribution is given by 

E[Xm ℄ = 22

m+2 2

m

2

!

m+2 s2  ; 1; 2 2 2

!

!

s2 exp - 2 ; 2

m0

(13)

where () is the gamma function [9, eq. 8.310.1] and () is the confluent hypergeometric function [9, eq. 9.210.1]. The MMSE solution of Ephraim and Malah is simply the first moment of (11); when combined with the optimal phase estimator (the observed phase #k [2]), it takes the form of a suppression rule: 1

b A k = (k) 2

1 = (k) 2

) Hk =

(1:5)

(1:5; 1; k ) exp(-k )

(-0:5; 1; -k )       k k k -k (1 + k )I0 +  k I1 ; exp 2 k 2 2 2

p

(1:5)



(14)

2 Three Alternative Suppression Rules The spectral amplitude estimator given by (14), while being optimal in an MMSE sense, requires the computation of exponential and Bessel functions. We now proceed to derive three alternative suppression rules, all of which closely approximate it while allowing a more efficient implementation.

2.1

Joint Maximum A Posteriori Spectral Amplitude and Phase Estimator

Joint estimation of the real and imaginary components of Xk under either the maximum a posteriori (MAP) or MMSE criterion leads to the Wiener estimator (due to symmetry of the Gaussian posterior distribution). However, as we have seen, the problem may be reformulated in terms of spectral amplitude Ak and phase k ; it is then possible to obtain a joint MAP estimate by maximising the posterior distribution p(ak ; k jYk ):

p(ak; k jYk ) / p(Yk jak ; k )p(ak; k ) 0

/

ak expB 2  x (k)d (k)

2





Yk - ak ej k

d (k)

1

a2k C A: x (k)

Since ln() is a monotonically increasing function, one may equivalently maximise the natural logarithm of p(ak ; k jYk ). Define

J1 = -



2



Yk - ak ej k

d (k)

a2k + ln ak + constant: x (k) 6

2 THREE ALTERNATIVE SUPPRESSION RULES

CUED/F-INFENG/TR.404

Differentiating J1 w.r.t. k yields

 1 h  -j k j (Yk - ak e J1 = )(-jak e k )  k d (k) Setting to zero and substituting Yk

=

i

j -j k +(Yk - ak e k )(jak e )

:

Rk exp(j#k ), we get

0 = ja^k Rk ej(#k- ^ k ) - ja^k Rk e-j(#k - ^ k) = 2j sin (#k - ^ k) ; and therefore

^ k = #k ;

(15)

i.e., the joint MAP phase estimate is simply the noise phase. Differentiating J1 w.r.t. ak yields

 1 h  -j k j J1 = (Yk - ak e )(-e k ) ak d (k)

i

j -j k +(Yk - ak e k )(-e ) -

1 2ak : + x (k) ak

Setting the above to zero implies

 (k) 2a^2k = x (k) - x a^k [2a^k - Rk e-j(#k - ^ k ) - Rkej(#k - ^ k ) ℄ d (k) = x (k) - k a ^k [2a ^ k - 2Rk cos(#k - ^ k )℄: From (15), we have cos(#k - ^ k ) = 1; therefore

0 = 2(1 + k )a^2k - 2Rkk a^k - x(k): Solving the above quadratic equation, and substituting

 x (k) = k R2k ;

k

(16)

we have q

b A k=

k + 2k + 2(1 + k ) kk Rk : 2(1 + k )

Together (17) and (15) define the following suppression rule: q

Hk =

k + 2k + 2(1 + k ) kk : 2(1 + k )

7

(17)

2 THREE ALTERNATIVE SUPPRESSION RULES

2.2

CUED/F-INFENG/TR.404

Maximum A Posteriori Spectral Amplitude Estimator

Recall that the posterior density p(ak jYk ) of (11), arising from integration over the phase term k , is Rician with parameters (2k ; s2k ). For large arguments of I0 () we may substitute the approximation

I0 (jxj) 

1 exp(jxj) 2 jxj

q

into (11), yielding

p(akjYk ) 

1



ak  2 q exp 22k sk 1

  ! 1 ak - sk 2 ; 2 k

(18)

which we note is almost Gaussian. Considering (18), and again taking the natural logarithm and maximising w.r.t. ak , we obtain

1  ak - sk 2 1 J2 = + ln ak + constant 2 k 2 1 2ak 2 0 = a^2k - sk a^k - k : 2

s -a d J2 = k 2 k dak k

+

(19)

Substituting (12) and (16) into (19) and solving, we arrive at an estimator differing from that of the joint MAP solution only by a factor of two under the square root (owing to p the factor ak in (18); replacement with ak would yield the joint MAP spectral amplitude solution): q

b A k=

k + 2k + (1 + k ) kk Rk: 2(1 + k )

(20)

Combining (20) with the Ephraim and Malah phase estimator (i.e., the observed phase #k ) yields the following suppression rule: q

Hk =

k + 2k + (1 + k ) kk : 2(1 + k )

8

3 COMPARISON OF APPROXIMATIONS

2.3

CUED/F-INFENG/TR.404

Minimum Mean-Square Error Spectral Power Estimator

Recall that Ephraim and Malah formulated the first moment of a Rician posterior distribution, E[Ak jYk ℄, as a suppression rule. The second moment of that distribution, E[A2k jYk ℄, reduces to a much simpler formula:

E[A2kjYk ℄ = 22k + s2k ; where 2k and s2k are as defined in (12). Letting Bk in (21) yields

=

(21)

A2k and substituting for 2k and s2k

!

Bk = b

k 1 + k 2 Rk ; 1 + k

k

b where B k is the optimal spectral power estimator in the MMSE sense, as it is also the first moment of a new posterior distribution p(bk jYk ) having a noncentral chi-square probability density function with two degrees of freedom and parameters (2k ; s2k ). When combined with the optimal phase estimator of Ephraim and Malah (i.e., the observed phase #k ), this estimator also takes the form of a suppression rule:

Hk =

v u u t

!

k 1 + k : 1 + k

k

(22)

3 Comparison of Approximations Figure 1 shows the Ephraim and Malah suppression rule as a function of instantaneous SNR (defined in [2] as k - 1) and a priori SNR k . Figures 2, 3, and 4 show the gain difference (in dB) between it and each of the three derived suppression rules (note the difference in scale). Table 1 shows a comparison of the magnitude of gain differences for the three approximations. The MMSE spectral power suppression rule provides the best and most consistent approximation to the Ephraim and Malah suppression rule, with only slightly less suppression in regions of low a priori SNR. The MAP spectral amplitude approximation, although still within 5 dB of the optimal value over a wide SNR range, is the poorest. While the sign of the deviation of each of these two approximations is constant, that of the joint MAP suppression rule depends on the instantaneous and a priori SNR. Ephraim and Malah [2] show that at high SNR, their derived suppression rule converges to the Wiener suppression rule detailed in Section 1.3, formulated as a function of k :

Hk =

k : 1 + k

9

(23)

3 COMPARISON OF APPROXIMATIONS

CUED/F-INFENG/TR.404

10 0

Gain (dB)

−10 −20 −30 −40 −50 −60 30 20

30

10

20 0

10 0

−10

−10

−20

−20 −30

Instantaneous SNR (dB)

−30

A Priori SNR (dB)

Figure 1: Ephraim and Malah MMSE suppression rule

5 4

Gain Difference (dB)

3 2 1 0 −1 −2 −3 −4 −5 30 20

30

10

20 0

10 0

−10

−10

−20 Instantaneous SNR (dB)

−20 −30

−30

A Priori SNR (dB)

Figure 2: Joint MAP suppression rule gain difference

10

3 COMPARISON OF APPROXIMATIONS

CUED/F-INFENG/TR.404

5 4

Gain Difference (dB)

3 2 1 0 −1 −2 −3 −4 −5 30 20

30

10

20 0

10 0

−10

−10

−20

−20 −30

Instantaneous SNR (dB)

−30

A Priori SNR (dB)

Figure 3: MAP approximation suppression rule gain difference

5 4

Gain Difference (dB)

3 2 1 0 −1 −2 −3 −4 −5 30 20

30

10

20 0

10 0

−10

−10

−20 Instantaneous SNR (dB)

−20 −30

−30

A Priori SNR (dB)

Figure 4: MMSE power suppression rule gain difference

11

4 DISCUSSION

CUED/F-INFENG/TR.404 ( k - 1; k )

Suppression Rule MMSE Power Joint MAP MAP Ampl. Approx.

2

[-30; 30℄

Mean Maximum 0.68473 0.52192 1.2612

-1.0491 +1.7713 +4.7012

dB

( k - 1; k )

Range 1.0469 2.3352 4.7012

2

[-100; 100℄

Mean Maximum 0.63092 0.74507 1.7423

-1.0491 +1.9611 +4.9714

dB

Range 1.0491 2.5250 4.9714

Table 1: Magnitude of deviation from Ephraim and Malah MMSE suppression rule gain This relationship is easily seen from the MMSE spectral power suppression rule given by (22), expanded slightly to the following:

Hk =

v u u t

1 k 1 +  k k

!

k : + 1 + k

(24)

As the instantaneous SNR becomes large, (24) may be seen to approach the Wiener suppression rule given by (23). As it becomes small, the 1= k term in (24) lessens the severity of the attenuation. Capp´e [10] makes the same observation concerning the behaviour of the Ephraim and Malah suppression rule, although the simpler form of the MMSE spectral power estimator shows the influence of the a priori and a posteriori SNR more explicitly. Lastly, we note that the success of the Ephraim and Malah suppression rule is largely due to the authors’ decision-directed approach for estimating the a priori SNR k [10]. b For a given short-time block l, the decision-directed a priori SNR estimate  k is given by a geometric weighting of the SNR in the previous and current blocks: 2

b jX (l - 1)j k = k d (l - 1; k) b

+ (1 - ) max [0; k (l) - 1℄ ;

2 [0; 1):

(25)

It is instructive to consider the case in which k = k - 1; i.e., = 0 in (25) so that the estimate of the a priori SNR is based only on the current block. In this case the MMSE spectral power suppression rule given by (24) reduces to the method of power spectral subtraction (see, e.g., [7]). Figure 5 shows a comparison of the derived suppression rules under this constraint.

4 Discussion In the first part of this report we have provided a common interpretation of existing suppression rules based on a simple Gaussian statistical model. Within the framework

12

4 DISCUSSION

CUED/F-INFENG/TR.404

0

−5

−10

Gain (dB)

−15 MMSE Spectral Amplitude Joint MAP Spectral Amplitude and Phase MAP Spectral Amplitude Approximation MMSE Spectral Power

−20

−25

−30

−35

−40 −30

−20

−10 0 10 Instantaneous SNR = A priori SNR (dB)

20

30

Figure 5: Optimal and derived suppression rules of Bayesian estimation we have seen how two MMSE suppression rules may be derived, and have provided a direct proof of the Wiener estimator [1] within the cost function framework. While the Ephraim and Malah estimator [2] is well-known and widely used, its implementation requires the evaluation of computationally expensive exponential and Bessel functions. Moreover, an intuitive interpretation of its behaviour is obscured by these same functions. With this motivation, we have presented in the second part of this report a derivation and comparison of three efficient approximations to the Ephraim and Malah MMSE spectral amplitude estimator. The solutions we have derived may be implemented where increased efficiency is desired, e.g., as in [11], and each may be coupled with hypotheses concerning uncertainty of speech presence, as in [7, 2]. Moreover, the form of the MMSE spectral power suppression rule given by (24) provides a clear insight into the behaviour of the Ephraim and Malah solution. Finally, just as Ephraim and Malah argued that log-spectral amplitude estimation may be most appropriate for speech perception [12], so in other cases may be MMSE spectral power estimation; e.g., when calculating auditory masked thresholds for use in perceptually motivated noise reduction [13].

13

A MMSE COST FUNCTION PROOF

CUED/F-INFENG/TR.404

5 Acknowledgments Material by the first author is based upon work supported under a U.S. National Science Foundation Graduate Fellowship. The authors also gratefully acknowledge the contribution of Shyue Ping Ong to this report.

A MMSE Cost Function Proof We begin by noting that for complex x , xr + jxi [14]:

Fx(x) =

Zx

1

-

fx(x) dx ,

Z xr Z xi

1 1

-

-

fx(x) dxi dxr :

This enables the cost function integral of (4) to be written as (replacing the k subscript with r; i to denote real and imaginary parts, respectively)

F(^xr ; x^i ) =

Z1 Z1

1 1

-

-



2

2

(^ xr - xr ) + (^xi - xi )

2



Now let r = xi = x^i - r sin :

q



exp

2

2

(^ xr - xr ) + (^xi - xi )

(xr ; xi ) (r; )

=

-

2

(yr - xr ) + (yi - xi )

2d





x2r + x2i 2x

;  = arctan x^x^ri --xxir so that xr

- cos 

r sin 

- sin 

-r cos 

!

= x ^r -

dxi dxr : r cos  and



and



(xr ; xi ) = r; (r; )

giving

Z 2 Z 1

2 2! ^i + r sin ) (yr - x ^r + r cos ) + (yi - x 3 F(^xr ; x^i ) = r exp 2d =0 r=0 ! (^ xr - r cos )2 + (^xi - r sin )2 dr d:  exp 2x

14

A MMSE COST FUNCTION PROOF

CUED/F-INFENG/TR.404

Interchanging the limits of integration and simplifying, we obtain:

F(^xr ; ^xi) =

Z1 Z 0



r3 exp (a + b sin  + cos ) d dr;

(26)

-

where 2 2 (y - x ^r ) + (yi - x ^i )  2 + 2 a(r) = - d 2 2 x r2 - r d x 2d ! y - x^ x^ b(r) = 2r 2i - i 2 i x d ! x^r yr - x^r

(r) = 2r 2 : x 2d

-

x^2r + x^2i 2x

A solution to the inner integral of (26) is given by [9, p. 357, eq. 3.338.4]:

F(^xr ; x^i ) =

Z1 0

q

2r3 ea I0

Z1

 = 2e -

0



b2 + 2 dr

r3 e- r I0 ( r) dr 2

(27)

where

1 = 2 x (^xr ; ^xi) = 2 (^xr ; ^xi) =

v u u t

+

1 2d

x^r 2x

-

2

yr - x^r 2d

!

2

x^i 2x

+

(yr - x ^r ) + (yi - x ^i )

2d

2 +

-

yi - x^i 2d

2

!

x^2r + x^2i : 2x

A solution to (27) is given by [9, p. 737, eq. 6.631.1]: ! 2 (2)   2; 1; F(^xr ; x^i ) = 2e 2 2 4 -

where ( ; ; z) is the confluent hypergeometric function [9, eq. 9.210.1]. Letting

z=

2 ; 4

so that 



2d + 2x xb k - 2x yk ; rz = r = 2 2d 2x 15

REFERENCES

CUED/F-INFENG/TR.404

yields the following: 



 - e (2; 1; z) rF = r 2   1 0 2 2 + 2 x  2 -  d x b k - x y k A = e (2(3; 2; z) - (2; 1; z)) : 2 2d 2x Noting that [9, p. 1086, eq. 9.212.3]

2(3; 2; z) = (2; 2; z) + (2; 1; z) z = e + (2; 1; z) we obtain !

F=

r

2x 2 z- bk e x y ; 2d + 2x k

which is zero when xk b

=

2x

y ; 2d + 2x k

the Weiner gain solution. This solution may be shown via the second-derivative test to satisfy sufficient conditions for a minimum, thereby concluding the proof.

References [1] N. Wiener, Extrapolation, Interpolation, and Smoothing of Stationary Time Series: With Engineering Applications, Principles of Electrical Engineering Series. MIT Press, Cambridge, MA, 1949. [2] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-time spectral amplitude estimator,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-32, no. 6, pp. 1109–1121, Dec. 1984. [3] S. J. Godsill and P. J. W. Rayner, Digital Audio Restoration: A Statistical Model Based Approach, Springer-Verlag, Berlin, 1998. [4] R. M. Gray and L. D. Davisson, An Introduction to Statistical Signal Processing, electronic document, May 2000 edition, available on-line at http://ee.stanford. edu/~gray/sp.html.

16

REFERENCES

CUED/F-INFENG/TR.404

[5] D. L. Wang and J. S. Lim, “The unimportance of phase in speech enhancement,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-30, no. 4, pp. 679–681, Aug. 1982. [6] P. Vary, “Noise suppression by spectral magnitude estimation—Mechanism and theoretical limits,” Signal Processing, vol. 8, no. 4, pp. 387–400, July 1985. [7] R. J. McAulay and M. L. Malpass, “Speech enhancement using a soft-decision noise suppression filter,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-28, no. 2, pp. 137–145, Apr. 1980. [8] S. O. Rice, “Statistical properties of a sine wave plus random noise,” Bell System Technical Journal, vol. 27, pp. 109–157, Jan. 1948. [9] I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, Academic Press, Inc., San Diego, fifth edition, 1994. [10] O. Capp´e, “Elimination of the musical noise phenomenon with the Ephraim and Malah noise suppressor,” IEEE Transactions on Speech and Audio Processing, vol. 2, no. 2, pp. 345–349, Apr. 1994. [11] A. Akbari Azirani, R. le Bouquin Jeann`es, and G. Faucon, “Optimizing speech enhancement by exploiting masking properties of the human ear,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, 1995, vol. 1, pp. 800–803. [12] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error log-spectral amplitude estimator,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-33, no. 2, pp. 443–445, Apr. 1985. [13] P. J. Wolfe and S. J. Godsill, “Towards a perceptually optimal spectral amplitude estimator for audio signal enhancement,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Istanbul, 2000, vol. 2, pp. 821–824. [14] C. W. Therrien, Discrete Random Signals and Statistical Signal Processing, Prentice Hall Signal Processing Series. Prentice-Hall, Inc., New Jersey, 1992.

17