A Maximum Likelihood Approach to Nonlinear Convolutive Blind Source Separation

Jingyi Zhang, L.C. Khor, W.L. Woo and S.S. Dlay
School of Electrical, Electronic and Computer Engineering
University of Newcastle upon Tyne
Newcastle upon Tyne, NE1 7RU, UNITED KINGDOM
Email: [email protected], [email protected], [email protected], [email protected]

Abstract. A novel learning algorithm for blind source separation of post-nonlinear convolutive mixtures with non-stationary sources is proposed in this paper. The proposed mixture model characterizes both the convolutive mixing and the post-nonlinear distortion of the sources. A novel iterative technique based on the Maximum Likelihood (ML) approach is developed, in which the Expectation-Maximization (EM) algorithm is generalized to estimate the parameters of the proposed model. The post-nonlinear distortion is estimated by a set of polynomials. The sufficient statistics associated with the source signals are estimated in the E-step, while in the M-step the parameters are optimized using these statistics. In general, the nonlinear maximization in the M-step cannot be formulated in closed form. However, the use of polynomials as the nonlinearity estimator makes the M-step tractable, so that it can be solved via linear equations.

1. INTRODUCTION

The study of blind deconvolution has so far concentrated solely on the linear mixture [1], [2], and the existing methods only perform well when the mixture is assumed to be linear. Where nonlinear distortion in the mixture is considered, all of the previous works focused only on instantaneous mixing of signals [3]. To the best of the authors' knowledge, the problem of post-nonlinear convolutive mixtures of non-stationary sources has not been previously addressed. However, in practical applications such as speech processing, source signals are inherently convolved in a real acoustic environment where signals are corrupted by noise and interference. Furthermore, studies show that carbon-button microphones present evidence of a "phantom formant" which occurs when simple static nonlinearities are applied to speech signals. The non-uniform flux of the permanent magnet and the nonlinear response of the suspension in the loudspeaker also contribute to the nonlinear distortion of speech signals. Therefore, an accurate representation of the mixed signals must be developed to account for the existence of the nonlinearity. The observed signal $x_t$ at time $t$ of the noisy post-nonlinear convolutive mixture can be expressed as follows:

$$x_t = g\left(\sum_{l=0}^{L} M_l s_{t-l}\right) + n_t \qquad (1)$$

where the vector $s_t$ contains the unknown non-stationary source signals, $g(\cdot)$ is the nonlinear function and $M_l$ is the $l$th delayed mixing matrix. The additive noise $n_t$ is assumed to be Gaussian. As in typical BSS problems, the aim is to estimate $s_t$, $g(\cdot)$, $M_l$ and $n_t$ with only information on $x_t$ available. Existing algorithms, e.g. [1, 2], do not cater for both the convolutive and non-stationary properties of the mixture and hence perform poorly on the problem described in equation (1). This paper presents a pioneering contribution to the noisy post-nonlinear convolutive mixing problem, where we propose for the first time a solution to the problem. A state-space model representing the post-nonlinear convolutive mixture of non-stationary signals is constructed, and a general EM framework is formulated for estimating $s_t$, $g(\cdot)$, $M_l$ and $n_t$. The generalized EM algorithm employs a set of polynomials to estimate the nonlinear distortion, whose coefficients are updated as part of the mixing parameters. In the proposed algorithm, the sufficient statistics associated with the source signals are inferred in the E-step and the model parameters are updated in the M-step.
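As an illustration, the generative model of equation (1) can be simulated directly. The sketch below is not the authors' code; the filter length, placeholder source statistics, tanh nonlinearity and noise level are all assumptions chosen purely for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

I_s, I_o, L, T = 2, 2, 3, 500            # sources, sensors, filter order, samples
s = rng.standard_normal((I_s, T))        # placeholder sources (AR sources in Sec. 2)
M = [0.5 * rng.standard_normal((I_o, I_s)) for _ in range(L + 1)]  # delayed matrices M_l
g = np.tanh                              # an example post-nonlinearity g(.)
sigma_n = 0.05                           # additive Gaussian noise level

x = np.zeros((I_o, T))
for t in range(T):
    # sum_{l=0}^{L} M_l s_{t-l}, truncated at the start of the record
    lin = sum(M[l] @ s[:, t - l] for l in range(L + 1) if t - l >= 0)
    x[:, t] = g(lin) + sigma_n * rng.standard_normal(I_o)          # eq. (1)
```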

2. THE MODEL

The state-space model representing the post-nonlinear convolutive mixture of non-stationary signals is constructed in two parts. First, an autoregressive (AR) process is adopted to represent the temporal correlation of the non-stationary sources. The $K$th-order AR($K$) process for source $i$ is modeled as

$$s_{i,t} = h_{i,t,1}\, s_{i,t-1} + h_{i,t,2}\, s_{i,t-2} + \cdots + h_{i,t,K}\, s_{i,t-K} + v_{i,t} \qquad (2)$$

The source signal vector at time $t$ is formed by stacking each source signal [4, 5] and can be expressed as

$$s_t^T = [\,s_{1,t}^T \;\; s_{2,t}^T \;\; \cdots \;\; s_{I_s,t}^T\,]$$

The vector for each individual source is formed by stacking its last $K$ samples:

$$s_{i,t} = [\,s_{i,t} \;\; s_{i,t-1} \;\; \cdots \;\; s_{i,t-K+1}\,]^T$$

Second, the convolutive mixing and the nonlinear distortion are introduced into the model. To represent the convolutive mixture, the instantaneous observation matrix is extended to a full matrix of filters:

$$M = \begin{bmatrix} m_{11} & \cdots & m_{1I_s} \\ \vdots & \ddots & \vdots \\ m_{I_o 1} & \cdots & m_{I_o I_s} \end{bmatrix}, \qquad m_{ij} = [\,m_{ij,1} \;\; m_{ij,2} \;\; \cdots \;\; m_{ij,L}\,]$$

where $m_{ij,l}$ represents the $l$th delayed path between sensor $i$ and source $j$ ($L = K$). Hence, the proposed post-nonlinear convolutive model is now expressed as


$$s_t = H_t s_{t-1} + v_t$$
$$x_t = g(M s_t) + n_t \qquad (3)$$

The matrix $H_t$ is the evolution matrix. $W$ and $R$ are the covariance matrices of the zero-mean Gaussian noise vectors $v$ and $n$, respectively. The prior probability distribution over the initial states of the source signals $s_1$ is assumed to be Gaussian with mean $\mu$ and covariance $\Lambda$. To satisfy the independence between the source signals, the associated parameters in (3) need to be defined in the following form:

$$H_t = \mathrm{diag}[\,H_{1,t} \;\; H_{2,t} \;\; \cdots \;\; H_{I_s,t}\,], \quad H_{i,t} = \begin{bmatrix} h_{i,t} \\ I \;\; 0 \end{bmatrix}, \quad h_{i,t} = [\,h_{i,t,1} \;\; h_{i,t,2} \;\; \cdots \;\; h_{i,t,K}\,]$$

$$W = \mathrm{diag}[\,W_1 \;\; W_2 \;\; \cdots \;\; W_{I_s}\,], \quad (W_i)_{j_1 j_2} = \begin{cases} w_i & j_1 = j_2 = 1 \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$
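The block structure of eq. (4) can be made concrete in a few lines of code; the AR coefficients below are illustrative values only, not estimates produced by the algorithm.

```python
import numpy as np

def companion(h):
    """Evolution block H_{i,t} of eq. (4): the first row carries the AR
    coefficients h_{i,t,1..K}; the [I 0] block below shifts the state."""
    K = len(h)
    H = np.zeros((K, K))
    H[0, :] = h                       # [h_{i,t,1} ... h_{i,t,K}]
    H[1:, :K - 1] = np.eye(K - 1)     # [I 0] shift block
    return H

# H_t = diag[H_{1,t} ... H_{I_s,t}] for two sources, K = 3 (made-up coefficients)
blocks = [companion([0.5, -0.2, 0.1]), companion([0.8, 0.05, -0.1])]
K, I_s = 3, 2
H_t = np.zeros((I_s * K, I_s * K))
for i, Hi in enumerate(blocks):
    H_t[i * K:(i + 1) * K, i * K:(i + 1) * K] = Hi   # block-diagonal placement
```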

$\mu$ and $\Lambda$ are defined in a similar way. As mentioned in [6], blind deconvolution of non-stationary signals can be achieved by segmenting them into windows within which the source signals can be assumed to be stationary. Hence, the post-nonlinear convolutive model is now expressed as:

$$s_t^n = H_t^n s_{t-1}^n + v_t^n$$
$$x_t^n = g(M s_t^n) + n_t^n, \qquad n = 1, 2, \ldots, N \qquad (5)$$

Hence, a total of $N$ segments are observed. In the next section, the learning rules of the generalized EM algorithm are derived: the nonlinearity is linearized using a second-order Taylor series, the Kalman recursion is then used to infer the relevant statistics in the E-step, and the post-nonlinear distortion is estimated by a set of polynomials in the M-step.
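The windowing step behind eq. (5) amounts to a reshape of the observation record. In the sketch below, the window length 90 is taken from Section 4, while the signal itself is random placeholder data.

```python
import numpy as np

def segment(x, tau):
    """Split an (I_o, T) observation into N non-overlapping windows of
    length tau (eq. 5); trailing samples that do not fill a window are dropped."""
    I_o, T = x.shape
    N = T // tau
    return x[:, :N * tau].reshape(I_o, N, tau).transpose(1, 0, 2)  # -> (N, I_o, tau)

x = np.random.default_rng(1).standard_normal((2, 450))
segs = segment(x, 90)
print(segs.shape)   # -> (5, 2, 90)
```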

3. LEARNING RULES

To derive the generalized EM algorithm for the proposed model (5), the likelihood function is introduced as $L(\theta) = \log p(x \mid \theta) = \log \int p(x, s \mid \theta)\, ds$, where $\theta$ denotes all the parameters in the proposed model (5). Based on Jensen's inequality,

$$L(\theta) = \log \int \hat{p}(s)\, \frac{p(x, s \mid \theta)}{\hat{p}(s)}\, ds \;\geq\; \int \hat{p}(s) \log \frac{p(x, s \mid \theta)}{\hat{p}(s)}\, ds \;=\; F_1(\theta, \hat{p}) + F_2(\hat{p}) \;=\; F(\theta, \hat{p}) \qquad (6)$$

where $F_1(\theta, \hat{p}) = \int \hat{p}(s) \log p(x, s \mid \theta)\, ds$ and $F_2(\hat{p}) = -\int \hat{p}(s) \log \hat{p}(s)\, ds$. It is well known that in the E-step the maximization of $F(\theta, \hat{p})$ with respect to $\hat{p}(s)$ is achieved when $\hat{p}(s)$ is chosen to be exactly the conditional distribution of $s$ given the parameters obtained in the previous iteration, $\hat{p}(s) = p(s \mid x, \theta^{[q]})$, at which point the bound becomes an equality. Then, in the M-step, $F_1(\theta, \hat{p})$ is maximized with respect to $\theta$. Each iteration is guaranteed not to decrease $F(\theta, \hat{p})$.

3.1 E-step

The relevant statistics of the posterior distribution of the source signals $p(s_t \mid x_{1:\tau}, \theta^{[q]})$ need to be inferred and expressed in terms of the parameters obtained in the previous iteration, in order to update the parameters of the proposed model. For the linear convolutive mixture model, this is achieved using the Kalman smoother, which consists of two parts: a forward recursion, the Kalman filter, which uses the observations from $x_1$ to $x_t$, and a backward recursion which uses the observations from $x_\tau$ to $x_{t+1}$, where $\tau$ is the length of the observed signals. However, for the model defined by equation (5) the conditional densities are in general non-Gaussian, which can lead to an intractable solution. To solve this problem, the Extended Kalman Smoother (EKS) is used in the E-step: the basic Kalman smoother is applied at a linearized point of the nonlinear system. The nonlinearity is linearized by a second-order Taylor series at the mean of the current filtered (not smoothed) state $\hat{s}^n_{t|t-1}$. Hence, after the linearization process, the derivative matrix of the vector-valued function $g$ at the point $\hat{s}^n_{t|t-1}$ is defined as

$$D_{\hat{s}^n_{t|t-1}} = \left. \frac{\partial g}{\partial s_t^n} \right|_{s_t^n = \hat{s}^n_{t|t-1}} \qquad (7)$$

and the model (5) can be expressed as

$$s_t^n = H_t^n s_{t-1}^n + v_t^n$$
$$x_t^n = g(M \hat{s}^n_{t|t-1}) + D_{\hat{s}^n_{t|t-1}} (s_t^n - \hat{s}^n_{t|t-1}) + n_t^n \qquad (8)$$
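The derivative matrix of eq. (7) is easy to check numerically: for an elementwise nonlinearity $g$, it reduces to $\mathrm{diag}(g'(Ms))\,M$. The mixing matrix and linearization point below are arbitrary stand-ins.

```python
import numpy as np

def jacobian_D(g, M, s_pred, eps=1e-6):
    """Central-difference Jacobian of s -> g(M s) evaluated at the predicted
    mean s_pred = s_hat_{t|t-1}, as in eq. (7)."""
    f = lambda s: g(M @ s)
    D = np.zeros((M.shape[0], s_pred.size))
    for k in range(s_pred.size):
        e = np.zeros(s_pred.size)
        e[k] = eps
        D[:, k] = (f(s_pred + e) - f(s_pred - e)) / (2 * eps)
    return D

M = np.array([[0.7, 0.3], [0.8, 0.2]])
s_pred = np.array([0.1, -0.2])
D = jacobian_D(np.tanh, M, s_pred)
# for elementwise g, D should match diag(1 - tanh(M s)^2) @ M
```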

Therefore, given the output, the conditional distribution of the hidden states in the linearized model (8) at every instant in time is Gaussian. Hence, the basic Kalman smoother can be applied to model (8) to infer the associated statistics of the conditional distribution. The inferred first-order statistic is the source conditional mean $\hat{s}_t^n$ for segment $n$, expressed as $\hat{s}_t^n = \langle s_t^n \rangle$, where $\langle \cdot \rangle$ denotes the integral over the source posterior $p(s_t \mid x_{1:\tau}, \theta^{[q]})$. The inferred second-order statistics of the hidden source signals are the autocorrelation matrices of source $i$ for segment $n$, denoted $C^n_{ii,tt}$ without time delay and $C^n_{ii,t(t-1)}$ with time delay:

$$C^n_{ii,tt} = \langle s^n_{i,t} (s^n_{i,t})^T \rangle = [\,c^n_{ii,1,tt} \;\; c^n_{ii,2,tt} \;\; \cdots \;\; c^n_{ii,L,tt}\,]^T \qquad (9a)$$

$$C^n_{ii,t(t-1)} = \langle s^n_{i,t} (s^n_{i,t-1})^T \rangle = [\,c^n_{ii,1,t(t-1)} \;\; c^n_{ii,2,t(t-1)} \;\; \cdots \;\; c^n_{ii,L,t(t-1)}\,]^T \qquad (9b)$$

Here, the first row of $C^n_{ii,tt}$ is denoted $c^n_{ii,1,tt}$, and the autocorrelation matrix of the full state $s_t^n$ is denoted $C^n_{tt}$. Because the source signals are statistically independent, for different sources $i$ and $j$, $C^n_{ij,tt} = \hat{s}^n_{i,t} (\hat{s}^n_{j,t})^T$. The model parameters are then estimated to maximize the likelihood bound in (6) in the following M-step, with the relevant statistics expressed in terms of the model parameters obtained from the previous iteration.

3.2 M-step

In the M-step, $F_1(\theta, \hat{p})$ in (6) is maximized with respect to all the model parameters, using the relevant statistics obtained from the E-step. Expressed in terms of the model parameters, $F_1(\theta, \hat{p})$ can be written as:

$$F_1(\theta, \hat{p}) = -\frac{1}{2} \sum_{n=1}^{N} \Bigg[ \sum_{i=1}^{I_s} \log\det \Lambda_i^n + (\tau - 1) \sum_{i=1}^{I_s} \log w_i^n + \tau \log\det R + \sum_{i=1}^{I_s} \big\langle (s^n_{i,1} - \mu_i^n)^T (\Lambda_i^n)^{-1} (s^n_{i,1} - \mu_i^n) \big\rangle$$
$$+ \sum_{t=2}^{\tau} \sum_{i=1}^{I_s} \frac{1}{w_i^n} \big\langle (s^n_{i,t} - h^n_{i,t} s^n_{i,t-1})^2 \big\rangle + \sum_{t=1}^{\tau} \big\langle \big(x_t^n - g(M\hat{s}^n_{t|t-1}) - D_{\hat{s}^n_{t|t-1}}(s_t^n - \hat{s}^n_{t|t-1})\big)^T R^{-1} \big(x_t^n - g(M\hat{s}^n_{t|t-1}) - D_{\hat{s}^n_{t|t-1}}(s_t^n - \hat{s}^n_{t|t-1})\big) \big\rangle \Bigg] \qquad (10)$$

For the segment-wise parameters, the update equations are exactly the same as those for the linear convolutive mixture in [5], and the new estimators for segment $n$ of source $i$ are given by the following closed-form equations:

$$\mu_i^n = \hat{s}^n_{i,1}, \qquad \Lambda_i^n = C^n_{ii,11} - \mu_i^n (\mu_i^n)^T$$
$$h^n_{i,t} = c^n_{ii,1,t(t-1)} \big( C^n_{ii,(t-1)(t-1)} \big)^{-1}, \qquad w_i^n = \frac{1}{\tau - 1} \sum_{t=2}^{\tau} \Big[ c^n_{ii,1,tt} - h^n_{i,t} \big(c^n_{ii,1,t(t-1)}\big)^T \Big] \qquad (11)$$

Then $\mu$, $\Lambda$, $H_t^n$ and $W^n$ can be reconstructed following the definitions in Section 2. However, the update equations for $M$ and $R$, which involve the statistics of all observed segments, differ from those for linear deconvolution and are more complex. Because the new estimator for $M$ cannot be expressed in closed form, the update equation for $M$ is derived from the gradient-ascent algorithm, and its element $m_{ij}$ is estimated from the following equation:

$$m_{ij,t+1} = m_{ij,t} + \eta \left. \frac{\partial F_1(\theta, \hat{p})}{\partial m_{ij}} \right|_{m_{ij} = m_{ij,t}} \qquad (12a)$$
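Given the E-step statistics, the closed-form updates of eq. (11) for one source in one segment are a few matrix operations. The statistics below are synthetic stand-ins, not Kalman-smoother output.

```python
import numpy as np

def segment_updates(s1_hat, C11, c_tt, c_tlag, C_lag, tau):
    """Eq. (11): mu, Lambda, AR coefficients h_t and innovation variance w
    for one source/segment. c_tt[t] is the scalar c_{ii,1,tt}, c_tlag[t] the
    row vector c_{ii,1,t(t-1)}, C_lag[t] the matrix C_{ii,(t-1)(t-1)},
    listed for t = 2..tau."""
    mu = s1_hat
    Lam = C11 - np.outer(mu, mu)
    h = [c @ np.linalg.inv(C) for c, C in zip(c_tlag, C_lag)]
    w = sum(ctt - ht @ c for ctt, c, ht in zip(c_tt, c_tlag, h)) / (tau - 1)
    return mu, Lam, h, w

K, tau = 2, 3
s1 = np.array([1.0, 0.5])
C11 = np.outer(s1, s1) + np.eye(K)                 # synthetic second moment
c_tt = [2.0, 2.0]
c_tlag = [np.array([1.0, 0.2]), np.array([0.8, 0.1])]
C_lag = [np.eye(K), np.eye(K)]
mu, Lam, h, w = segment_updates(s1, C11, c_tt, c_tlag, C_lag, tau)
```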

$$\frac{\partial F_1(\theta,\hat p)}{\partial m_{ij}} = \sum_{n=1}^{N}\sum_{t=1}^{\tau}\Bigg[ J_1^n\, g_i'\,(\hat s^n_{j,t|t-1})^T + J_2^n\, g_i''\,(\hat s^n_{j,t|t-1})^T + J_3^n\, g_i'\,(\hat s^n_{j,t} - \hat s^n_{j,t|t-1})^T$$
$$- \sum_{h=1}^{I_o}\sum_{k_3=1}^{I_s} g_h'\, r^{-1}_{ih}\, m_{hk_3} \Big( C^n_{k_3 j,tt} - \hat s^n_{k_3,t}(\hat s^n_{j,t|t-1})^T - \hat s^n_{k_3,t|t-1}(\hat s^n_{j,t})^T + \hat s^n_{k_3,t|t-1}(\hat s^n_{j,t|t-1})^T \Big)\, g_i' \Bigg] \qquad \mathrm{(12b)}$$

with

$$J_1^n = \sum_{k_1=1}^{I_o} x^n_{k_1,t}\, r^{-1}_{ik_1} - \sum_{k_2=1}^{I_o} g_{k_2}\, r^{-1}_{ik_2} - \sum_{q_2=1}^{I_o} r^{-1}_{iq_2}\, g_{q_2}'\, m_{q_2} (\hat s^n_t - \hat s^n_{t|t-1})$$

$$J_2^n = \sum_{p_2=1}^{I_o} x^n_{p_2,t}\, r^{-1}_{p_2 i}\, m_i (\hat s^n_t - \hat s^n_{t|t-1}) - \sum_{q=1}^{I_s}\sum_{p=1}^{I_o}\sum_{k=1}^{I_s} r^{-1}_{ip}\, g_p'\, \mathrm{tr}\Big\{ m_{ik}\, m_{pq}^T \Big( C^n_{qk,tt} - \hat s^n_{q,t}(\hat s^n_{k,t|t-1})^T - \hat s^n_{q,t|t-1}(\hat s^n_{k,t})^T + \hat s^n_{q,t|t-1}(\hat s^n_{k,t|t-1})^T \Big) \Big\}$$

$$J_3^n = \sum_{p_1=1}^{I_o} x^n_{p_1,t}\, r^{-1}_{p_1 i} - \sum_{q_1=1}^{I_o} g_{q_1}'\, r^{-1}_{q_1 i}\, m_i (\hat s^n_t - \hat s^n_{t|t-1}) - \sum_{q_5=1}^{I_o} g_{q_5}\, r^{-1}_{q_5 i} \qquad \mathrm{(12c)}$$

where $m_i = [\,m_{i1} \;\; \cdots \;\; m_{iI_s}\,]$, $\eta$ is the learning rate, $r^{-1}_{ij}$ is the $ij$th element of the inverse of the matrix $R$, $g_i$ represents $g_i\big(\sum_j m_{ij}\, \hat s^n_{j,t|t-1}\big)$, and $g_i'$ is its first-order derivative with respect to the argument $\sum_j m_{ij}\, \hat s^n_{j,t|t-1}$. The covariance matrix $R$ can be estimated from the following:

$$R = \frac{1}{N\tau} \sum_{n=1}^{N} \sum_{t=1}^{\tau} \Big[ x_t^n (x_t^n)^T - x_t^n g^T - x_t^n (\hat s_t^n - \hat s^n_{t|t-1})^T D^T_{\hat s^n_{t|t-1}} - g\, (x_t^n)^T + g g^T + g\, (\hat s_t^n - \hat s^n_{t|t-1})^T D^T_{\hat s^n_{t|t-1}}$$
$$- D_{\hat s^n_{t|t-1}} (\hat s_t^n - \hat s^n_{t|t-1})(x_t^n)^T + D_{\hat s^n_{t|t-1}} (\hat s_t^n - \hat s^n_{t|t-1})\, g^T + D_{\hat s^n_{t|t-1}} \Big( C_{tt}^n - \hat s_t^n (\hat s^n_{t|t-1})^T - \hat s^n_{t|t-1} (\hat s_t^n)^T + \hat s^n_{t|t-1} (\hat s^n_{t|t-1})^T \Big) D^T_{\hat s^n_{t|t-1}} \Big] \qquad (13)$$

where $g$ represents $g(M\hat s^n_{t|t-1})$. To estimate the nonlinearity $g$, a self-adaptive algorithm is required, for which only the statistics obtained from the E-step are available. Each scalar function $g_i(\cdot)$ of $g$ is approximated by a set of polynomials [7, 8] defined below:

$$g_i(m_i \hat s^n_{t|t-1}) = \sum_{z=0}^{Z} a_{i,z}\, (m_i \hat s^n_{t|t-1})^z = a_i q_i^n \qquad (14)$$

where $a_{i,z}$ are the coefficients of the polynomial, $Z$ is the order of the expansion, $a_i = [\,a_{i,0} \;\; \cdots \;\; a_{i,Z}\,]$ and $q_i^n = [\,1 \;\; m_i \hat s^n_{t|t-1} \;\; \cdots \;\; (m_i \hat s^n_{t|t-1})^Z\,]^T$. The polynomial coefficients $a_{i,z}$ can be updated as part of the model parameters by maximizing $F_1(\theta, \hat p)$. The update equation for $a_{i,z}$ is obtained by the gradient-ascent algorithm with learning rate $\rho$:

$$a_{i,z,t+1} = a_{i,z,t} + \rho \left. \frac{\partial F_1(\theta, \hat p)}{\partial a_{i,z}} \right|_{a_{i,z} = a_{i,z,t}} \qquad (15)$$
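To illustrate eqs. (14) and (15), the sketch below runs a stochastic gradient loop on the polynomial coefficients. The true gradient is the lengthy expression (16); it is replaced here by a simple per-sample least-squares surrogate purely for demonstration, and the target nonlinearity $\upsilon + \upsilon^3$ is borrowed from Section 4.

```python
import numpy as np

def poly_g(a, u):
    """g_i(u) = sum_z a_z u^z = a . q with q = [1, u, ..., u^Z] (eq. 14)."""
    return a @ u ** np.arange(len(a))

rng = np.random.default_rng(2)
a, rho = np.zeros(4), 0.1                 # Z = 3, learning rate
for _ in range(2000):
    u = rng.uniform(-1.0, 1.0)            # stand-in for m_i s_hat_{t|t-1}
    q = u ** np.arange(len(a))
    # gradient-ascent step on -(1/2)(target - a.q)^2, cf. eq. (15)
    a += rho * (u + u**3 - a @ q) * q
```

Since the target is exactly representable in the basis, the coefficients converge toward $[0, 1, 0, 1]$, so the fitted polynomial closely matches $\upsilon + \upsilon^3$ on $[-1, 1]$.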

$$\frac{\partial F_1(\theta,\hat p)}{\partial a_{i,z}} = \sum_{n=1}^{N}\sum_{t=1}^{\tau}\Bigg[\sum_{k_1=1}^{I_o} x^n_{k_1,t}\, r^{-1}_{ik_1} - \sum_{k_2=1}^{I_o} (a_{k_2} q^n_{k_2})\, r^{-1}_{ik_2} - \sum_{k_3=1}^{I_o} r^{-1}_{ik_3}\Big(\sum_{z'=1}^{Z} z' a_{k_3,z'} (m_{k_3}\hat s^n_{t|t-1})^{z'-1}\Big) m_{k_3}(\hat s^n_t - \hat s^n_{t|t-1})\Bigg], \quad z = 0$$

and, for $z = 1, \ldots, Z$,

$$\frac{\partial F_1(\theta,\hat p)}{\partial a_{i,z}} = \sum_{n=1}^{N}\sum_{t=1}^{\tau}\Bigg[\sum_{k_1=1}^{I_o} x^n_{k_1,t}\, r^{-1}_{ik_1} (m_i \hat s^n_{t|t-1})^z - \sum_{k_2=1}^{I_o} (a_{k_2} q^n_{k_2})\, r^{-1}_{ik_2} (m_i \hat s^n_{t|t-1})^z + \sum_{k_3=1}^{I_o} x^n_{k_3,t}\, r^{-1}_{k_3 i}\, m_i (\hat s^n_t - \hat s^n_{t|t-1})\, z (m_i \hat s^n_{t|t-1})^{z-1}$$
$$- \sum_{k_4=1}^{I_o} r^{-1}_{ik_4}\Big(\sum_{z'=1}^{Z} z' a_{k_4,z'} (m_{k_4}\hat s^n_{t|t-1})^{z'-1}\Big) m_{k_4}(\hat s^n_t - \hat s^n_{t|t-1})\, (m_i \hat s^n_{t|t-1})^z - \sum_{k_5=1}^{I_o} (a_{k_5} q^n_{k_5})\, r^{-1}_{k_5 i}\, m_i (\hat s^n_t - \hat s^n_{t|t-1})\, z (m_i \hat s^n_{t|t-1})^{z-1}$$
$$- \sum_{k=1}^{I_o} \mathrm{tr}\Big\{ m_i^T m_k \Big( C^n_{tt} - \hat s^n_t (\hat s^n_{t|t-1})^T - \hat s^n_{t|t-1} (\hat s^n_t)^T + \hat s^n_{t|t-1} (\hat s^n_{t|t-1})^T \Big)\Big\}\, r^{-1}_{ik}\Big(\sum_{z'=1}^{Z} z' a_{k,z'} (m_k \hat s^n_{t|t-1})^{z'-1}\Big)\, z (m_i \hat s^n_{t|t-1})^{z-1} \Bigg] \qquad (16)$$

Thus, with all model parameters estimated in the M-step, the EM algorithm alternates between the E-step and the M-step until it converges.

4. RESULTS

The proposed algorithm is evaluated on the separation of a post-nonlinear convolutive mixture of two independent speech signals with additive Gaussian noise at different signal-to-noise ratios (SNR). Both the order $K$ of the AR process and $L$ are set to 3. The full convolutive mixing matrix $M$ is randomly selected as:

$$M = \begin{bmatrix} 0.7 & 0.9 & 0.1 & 0.3 & 0.1 & 0 \\ 0.8 & 0 & 0.2 & 1 & 0.26 & 0.1 \end{bmatrix} \qquad (17)$$

The post-nonlinear distortions are selected as $g_1(\upsilon_1) = \tanh(\upsilon_1)$ and $g_2(\upsilon_2) = \upsilon_2 + \upsilon_2^3$. The function $g_1$ is bounded while $g_2$ is unbounded; this selection is made merely to study the performance of the proposed algorithm under two different forms of nonlinearity. The observed signals are segmented into windows of length $\tau = 90$. All model parameters are estimated by the proposed algorithm for each segment of each test signal at each SNR. To compare the separation performance of the proposed algorithm against the Olsson-Hansen algorithm [5] for linear convolutive mixtures, the signal-to-interference ratio (SIR) is adopted as the performance measure, defined as follows:

$$\mathrm{SIR} = \frac{P_{11} + P_{22}}{P_{12} + P_{21}} \qquad (18)$$

where $P_{ij}$ is the power contributed by the $i$th estimated source signal to the $j$th original source signal, computed using the normalized cross-correlation. For our evaluation, a high SIR value is desirable. The superiority of the proposed algorithm over the Olsson-Hansen algorithm is demonstrated by the significant improvements shown in Table 1. The results prove the importance of incorporating a nonlinear model in the algorithm in cases where the observed signals


have been nonlinearly distorted. The proposed algorithm is also robust under high levels of noise in the separation of post-nonlinearly mixed signals.

    SIR (dB)                       SNR = 10dB   SNR = 20dB   SNR = 30dB
    Olsson-Hansen algorithm [5]        6.3          8.2          9.1
    Proposed algorithm                10.2         12.6         13.5

Table 1: Performance comparisons
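The SIR of eq. (18) converts directly to code; the power matrix below is invented to show the computation and is not taken from the experiments.

```python
import numpy as np

def sir_db(P):
    """Eq. (18): P[i, j] is the power contributed by estimated source i
    to original source j; report the ratio in dB."""
    return 10 * np.log10((P[0, 0] + P[1, 1]) / (P[0, 1] + P[1, 0]))

P = np.array([[9.0, 1.0], [1.0, 9.0]])   # hypothetical power matrix
print(round(sir_db(P), 2))               # -> 9.54
```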

5. CONCLUSIONS

In this paper, a novel Maximum Likelihood approach based on the EM algorithm for the post-nonlinear convolutive model of non-stationary signals has been proposed. The state-space model extends the linear instantaneous mixture model to the post-nonlinear convolutive mixture. To update the model parameters, the EM algorithm is generalized: the Extended Kalman Smoother is adopted to infer the hidden source signals and a set of polynomials is utilized to estimate the post-nonlinear distortion. Experimental results show that, for the given nonlinear data set, the proposed algorithm improves on the linear algorithm by over 50%.

REFERENCES

[1] H. Attias and C.E. Schreiner: 'Blind source separation and deconvolution: the dynamic component analysis algorithm', Neural Computation, 10(6): 1373-1424, 1998.
[2] S. Cruces-Alvarez, A. Cichocki, L. Castedo-Ribas: 'An Inversion Approach to Blind Source Separation', IEEE Trans. on Neural Networks, Vol. 11, No. 6, Nov. 2000.
[3] W.L. Woo and S.S. Dlay: 'Neural Network Approach to Blind Signal Separation of Mono-nonlinearly Mixed Signals', IEEE Transactions on Circuits and Systems - Part I, 52(6), 1236-1247, 2005.
[4] G. Doblinger: 'An adaptive Kalman filter for the enhancement of noisy AR signals', IEEE Int. Symp. on Circuits and Systems, Vol. 5, pp. 305-308, 1998.
[5] R.K. Olsson, L.K. Hansen: 'Probabilistic blind deconvolution for non-stationary sources', 12th European Signal Processing Conference, pp. 1697-1700, 2004.
[6] L. Parra and C. Spence: 'Convolutive Blind Separation of Non-stationary Sources', IEEE Trans. on Speech and Audio Processing, Vol. 8, No. 3, May 2000.
[7] W.L. Woo and S.S. Dlay: 'Nonlinear Blind Source Separation using a Hybrid RBF-MLP Network', IEE Proceedings on Vision, Image and Signal Processing, 152(2), 173-183, 2005.
[8] W.L. Woo and L.C. Khor: 'Blind Restoration of Nonlinearly Mixed Signals using Multilayer Polynomial Neural Network', IEE Proceedings on Vision, Image and Signal Processing, 151(1), 51-61, 2002.
