Gradient Algorithm for Reference-based Cubic ...

2011 IEEE Statistical Signal Processing Workshop (SSP)

GRADIENT ALGORITHM FOR REFERENCE-BASED CUBIC CONTRAST FUNCTION IN A DEFLATION SCENARIO

Fadoua BRAHIM, Rémi DUBROCA, Christophe DE LUIGI and Eric MOREAU

University of Sud Toulon Var, ISITV, LSEET UMR-CNRS 6017, av. G. Pompidou, BP 56, F-83162 La Valette du Var Cedex, France

ABSTRACT

The paper deals with the problem of blind source separation of a MIMO convolutive mixture by a deflation procedure. A criterion based on higher-order statistics and showing a cubic dependence w.r.t. the unknown equalizer parameters has recently been proposed. In order to optimize this criterion efficiently in a classical deflation scenario, we propose a new algorithm based on a fixed step size gradient. Computer simulations illustrate the good behavior and the usefulness of our algorithm in comparison with other approaches.

Index Terms: Contrast Functions, Blind Source Extraction, Higher Order Statistics, Gradient Optimization

1. INTRODUCTION

We consider blind source equalization in a MIMO context. In this case, non-observable source signals are mixed through an unknown multidimensional convolutive channel. The goal of separation is to recover all the source signals up to a permutation and up to a scalar filtering. In the literature, one can encounter two basic approaches to separation. In the first one, the source signals are recovered all together, and the separation is called global (see [1, 2]). In the second one, the source signals are extracted iteratively, one by one, using a so-called deflation stage (see [3, 4]). In this paper, we consider the second approach. Recently, solutions were proposed that consider higher-order statistics exhibiting a quadratic dependence [5] or a cubic one [6] on the parameters. This is achieved through the use of a so-called reference signal. The advantage of the proposed “cubic” criterion is twofold. It does not require additional constraints, unlike the “quadratic” criterion developed in [5], and its dependence on the parameters is of low order (lower than the considered cumulant order). In [6], an algorithm based on the best rank-1 approximation of third-order tensors was proposed to optimize this new criterion and was shown to be very efficient for the extraction of one source signal.

978-1-4577-0568-7/11/$26.00 ©2011 IEEE


However, this optimization scheme is not well suited to a classical deflation procedure. Indeed, this algebraic optimization method requires a good knowledge of the filter orders because it is sensitive to the rank estimation. Hence, we propose a new optimization algorithm which is based on a fixed step size gradient and which does not require any rank estimation. In Section 4, computer simulations illustrate the good behavior of our algorithm in comparison with other approaches.

2. MODEL, PROBLEM FORMULATION AND CONTRAST FUNCTION

We consider the linear and time-invariant (LTI) MIMO mixing system with K inputs and N outputs described by

$$ \mathbf{x}(n) = \sum_{k \in \mathbb{Z}} \mathbf{M}(k)\,\mathbf{s}(n-k) \triangleq \{\mathbf{M} \star \mathbf{s}\}(n) \tag{1} $$

where $\mathbf{x}(n)$ is the $(N \times 1)$ observation vector ($N \in \mathbb{N}$, $N > 2$), $\mathbf{s}(n)$ is the $(K \times 1)$ source vector ($K \in \mathbb{N}^*$), and $\mathbf{M}(n)$ is the $(N \times K)$ matrix corresponding to the impulse response of the convolutive mixing system; we assume $N > K$. Moreover, $n$ stands for any generic integer ($n \in \mathbb{Z}$) representing the discrete time index. Since the source signals are extracted iteratively, one by one, we are interested in the problem of extracting a single source signal. Hence the aim is to estimate a $(1 \times N)$ row vector filter with impulse response $\mathbf{w}(n)$, in such a way that the scalar signal

$$ y(n) = \{\mathbf{w} \star \mathbf{x}\}(n) \tag{2} $$

restores one of the sources $s_i(n)$, $i \in \{1, \ldots, K\}$, up to a nonzero scalar filter.

To reach this goal, assumptions are required about the source signals and the mixing system. The sources $s_i(n)$, $i \in \{1, \ldots, K\}$, are assumed to be zero-mean, unit-variance, stationary random signals. Moreover, they are statistically mutually independent (at least up to the order of the considered cumulants), which is the key property for separation. The mixing system is stable and left invertible (see [7]), i.e. there exists a system $\mathbf{W}(n)$, of which $\mathbf{w}(n)$ in (2) is a row, such that the global matrix filter

$$ \mathbf{G}(n) \triangleq \{\mathbf{W} \star \mathbf{M}\}(n) \tag{3} $$

corresponds to the identity system. As the source signals are unobservable, there are some inherent indeterminacies in their estimation: they can be recovered only up to a permutation and a scalar filtering. So, the extraction of one source is said to be achieved when there exists an index $i_0 \in \{1, \ldots, K\}$ such that the filter components of $\mathbf{g}(n)$, a row of $\mathbf{G}(n)$, read

$$ \forall i \in \{1, \ldots, K\} \quad g_i(n) \triangleq (\mathbf{g}(n))_i = g(n)\,\delta_{i,i_0}. \tag{4} $$

The above relation is called the “extraction condition” and expresses the fact that $y(n)$ is equal to the source signal $s_{i_0}(n)$ up to a filtering by the scalar filter with impulse response $g(n)$. If the sources are temporally independent and identically distributed, the scalar filter reduces to a delay $l$ and a scaling factor $\alpha$, that is:

$$ \exists i_0 \in \{1, \ldots, K\},\ \exists l \in \mathbb{Z} \quad g_i(n) \triangleq (\mathbf{g}(n))_i = \alpha\,\delta_{n-l}\,\delta_{i-i_0} \tag{5} $$

where $\alpha \in \mathbb{R}$, $\alpha \neq 0$.
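The model (1) and the extraction filter (2) can be illustrated with a minimal NumPy sketch. It assumes FIR filters and zero initial conditions (samples before n = 0 are taken as zero); the function names `convolutive_mix` and `extract`, and the array layouts, are illustrative choices and do not come from the paper.

```python
import numpy as np

def convolutive_mix(M, s):
    """Apply x(n) = sum_k M(k) s(n-k) for an FIR mixing filter.

    M: array of shape (L, N, K), the L taps of the mixing filter.
    s: array of shape (K, T), one source per row.
    Returns x of shape (N, T). Samples s(n) with n < 0 are assumed zero.
    """
    L, N, K = M.shape
    T = s.shape[1]
    x = np.zeros((N, T), dtype=complex)
    for k in range(L):
        # Tap k contributes M(k) s(n-k) to every output sample n >= k.
        x[:, k:] += M[k] @ s[:, :T - k]
    return x

def extract(w, x):
    """Apply y(n) = sum_k w(k) x(n-k) for a (1 x N) row-vector FIR filter.

    w: array of shape (D, N), the D taps w(0), ..., w(D-1).
    x: array of shape (N, T).
    Returns y of shape (T,).
    """
    D = w.shape[0]
    T = x.shape[1]
    y = np.zeros(T, dtype=complex)
    for k in range(D):
        y[k:] += w[k] @ x[:, :T - k]
    return y
```

With a mixing filter that is a pure one-sample delay, `convolutive_mix` returns the sources shifted by one sample, which is a convenient sanity check of the indexing.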

In practice, an extraction filter is found by the optimization of a so-called contrast function. Let us define the following contrast functions:

$$ C_4\{y(n)\} = |\mathrm{Cum}\{y(n), y^*(n), y(n), y^*(n)\}|, \tag{6} $$

$$ C_3\{y(n)\} = |\mathrm{Cum}\{y(n), y^*(n), y(n), r^*(n)\}|, \tag{7} $$

$$ C_2\{y(n)\} = |\mathrm{Cum}\{y(n), y^*(n), r(n), r^*(n)\}|, \tag{8} $$

where $r(n)$ is a given signal called the reference signal. It has been proved that, under the constraint $E\{|y(n)|^2\} = 1$, $C_4$, $C_3$ and $C_2$ are contrast functions, in [8, 4], [6] and [5] respectively. For $C_3$ and $C_2$, the condition that $r(n)$ depends linearly on the source signals is required. This paper is concerned with the optimization of the contrast $C_3$, that is

$$ \max_{\mathbf{w}} C_3\{y(n)\} \quad \text{under the constraint} \quad E\{|y(n)|^2\} = 1. \tag{9} $$

This contrast yields a cubic problem with respect to the parameters and has been optimized efficiently, in the case of the extraction of one source signal, by an algorithm based on the best rank-1 approximation of third-order tensors [6]. However, a projection onto a signal subspace is required. The dimension of the signal subspace is unknown in practice and has to be estimated. Furthermore, it was shown that the deflation procedure is very sensitive to this rank estimation. Hence, we propose a new optimization algorithm which is based on a fixed step size gradient in order to overcome this problem.

3. NEW OPTIMIZATION ALGORITHM

In practice, the optimization is achieved through the use of a MIMO-FIR left inverse filter of length D, which is taken to be causal because of the delay ambiguity. The row vectors which define the impulse response can be stacked in the following $(1 \times ND)$ row vector:

$$ \mathbf{w} \triangleq (\mathbf{w}(0) \ \ldots \ \mathbf{w}(D-1)). \tag{10} $$

We also define the $(ND \times 1)$ column vector

$$ \underline{\mathbf{x}}(n) \triangleq (\mathbf{x}(n)^T \ \mathbf{x}(n-1)^T \ \ldots \ \mathbf{x}(n-D+1)^T)^T. \tag{11} $$

It is then easily seen that the output of the extraction filter can be written as

$$ y(n) = \mathbf{w}\,\underline{\mathbf{x}}(n). \tag{12} $$

Considering the covariance matrix $\mathbf{R} = E\{\underline{\mathbf{x}}(n)\,\underline{\mathbf{x}}(n)^H\}$, we have $E\{|y(n)|^2\} = \mathbf{w}\mathbf{R}\mathbf{w}^H$. Associated with $C_3\{y(n)\}$, we consider the following fourth-order cross-cumulant

$$ \kappa_3\{y(n)\} \triangleq \mathrm{Cum}\{y(n), y^*(n), y(n), r^*(n)\}. \tag{13} $$

Now, using the multilinearity property of cumulants and (12), we have

$$ \kappa_3\{y(n)\} = \sum_{i,j,k} w_i\, w_j^*\, w_k\, \mathrm{Cum}\{x_i(n), x_j^*(n), x_k(n), r^*(n)\}, \tag{14} $$

where $w_i$ and $x_i(n)$ denote the components of $\mathbf{w}$ and $\underline{\mathbf{x}}(n)$. Thus, using the n-mode product of a tensor, this relation can be written as a third-order tensor decomposition

$$ \kappa_3\{y(n)\} = \mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w} \tag{15} $$

where the tensor $\mathcal{C}$ is defined componentwise as

$$ (\mathcal{C})_{i,j,k} = \mathrm{Cum}\{x_i(n), x_j^*(n), x_k(n), r^*(n)\}. \tag{16} $$
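The stacking in (11), the tensor in (16) and the contraction in (15) can be sketched in NumPy as follows. This is only a sample-average estimate under the zero-mean assumption, using the standard expansion of a fourth-order cumulant into a fourth-order moment minus products of second-order moments; the function names and the choice of estimator are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def stack_observations(x, D):
    """Build the stacked vectors of (11) for all usable time indices.

    x: array of shape (N, T). Returns an (N*D, T-D+1) array whose column
    for time n is [x(n); x(n-1); ...; x(n-D+1)], n = D-1, ..., T-1.
    """
    N, T = x.shape
    return np.vstack([x[:, D - 1 - d: T - d] for d in range(D)])

def cumulant_tensor(X, r):
    """Sample estimate of C[i,j,k] = Cum{x_i, x_j^*, x_k, r^*} (zero-mean data).

    X: (P, T) stacked observations, r: (T,) reference signal.
    """
    T = X.shape[1]
    Xc, rc = X.conj(), r.conj()
    m4 = np.einsum('it,jt,kt,t->ijk', X, Xc, X, rc) / T  # E[x_i x_j^* x_k r^*]
    R_xxc = X @ Xc.T / T   # E[x_i x_j^*]
    R_xx = X @ X.T / T     # E[x_i x_k]
    p_xr = X @ rc / T      # E[x_i r^*]
    p_xcr = Xc @ rc / T    # E[x_j^* r^*]
    return (m4
            - np.einsum('ij,k->ijk', R_xxc, p_xr)      # E[x_i x_j^*] E[x_k r^*]
            - np.einsum('ik,j->ijk', R_xx, p_xcr)      # E[x_i x_k] E[x_j^* r^*]
            - np.einsum('i,jk->ijk', p_xr, R_xxc.T))   # E[x_i r^*] E[x_j^* x_k]

def kappa3(C, w):
    """Contraction (15): sum_{i,j,k} C[i,j,k] w_i w_j^* w_k."""
    return np.einsum('ijk,i,j,k->', C, w, w.conj(), w)
```

Because sample averages are linear, `kappa3(cumulant_tensor(X, r), w)` coincides, up to rounding, with the same cumulant estimate computed directly on `y = w @ X`, which mirrors the multilinearity argument behind (14).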

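Recall that the 1-mode product of a tensor $\mathcal{A} \in \mathbb{C}^{I_1 \times I_2 \times I_3}$ by a matrix $\mathbf{U} \in \mathbb{C}^{J_1 \times I_1}$, denoted by $\mathcal{A} \bullet_1 \mathbf{U}$, is a $(J_1 \times I_2 \times I_3)$ tensor whose entries are given by

$$ (\mathcal{A} \bullet_1 \mathbf{U})_{j_1 i_2 i_3} \triangleq \sum_{i_1} a_{i_1 i_2 i_3}\, u_{j_1 i_1}. $$

Hence the optimization of the contrast function in (7) under the unit power constraint reads

$$ \max_{\mathbf{w}} |\mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w}| \quad \text{with} \quad \mathbf{w}\mathbf{R}\mathbf{w}^H = 1. \tag{17} $$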

To maximize (17), we propose to use a gradient method in which the parameter vector is normalized after each gradient step so that the unit power constraint remains satisfied. As for the method based on a kurtosis contrast function, this optimization requires a normalized criterion J w.r.t. $\mathbf{w}$, defined by:

$$ J(\mathbf{w}) = \frac{|\mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w}|^2}{|\mathbf{w}\mathbf{R}\mathbf{w}^H|^3}. \tag{18} $$
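As a short worked check (added here for illustration, assuming $\mathbf{w}\mathbf{R}\mathbf{w}^H > 0$), the criterion in (18) is invariant under any nonzero complex scaling $c$ of $\mathbf{w}$. This is why normalizing $\mathbf{w}$ after a gradient step does not change the value of $J$:

```latex
\begin{aligned}
J(c\,\mathbf{w})
 &= \frac{\left| c\, c^{*} c \;\, \mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^{*} \bullet_3 \mathbf{w} \right|^{2}}
         {\left| \, |c|^{2}\, \mathbf{w}\mathbf{R}\mathbf{w}^{H} \right|^{3}} \\
 &= \frac{|c|^{6} \left| \mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^{*} \bullet_3 \mathbf{w} \right|^{2}}
         {|c|^{6} \left| \mathbf{w}\mathbf{R}\mathbf{w}^{H} \right|^{3}}
  = J(\mathbf{w}).
\end{aligned}
```

The numerator is of degree three in $\mathbf{w}$ (two factors $\mathbf{w}$ and one factor $\mathbf{w}^*$) and is squared, while the denominator is of degree two and is cubed, so both scale as $|c|^6$.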

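The search directions for the maximum of the function J with respect to $\mathbf{w}$ are given by $\partial J(\mathbf{w}) / \partial \mathbf{w}^*$. To evaluate this gradient, we first need the derivative of an n-mode product. It is straightforward to show, for a tensor $\mathcal{A} \in \mathbb{C}^{I \times I \times I}$ and a vector $\mathbf{w} \in \mathbb{C}^{1 \times I}$, that

$$ \frac{\partial\, \mathcal{A} \bullet_1 \mathbf{w}^* \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w}^*}{\partial \mathbf{w}^*} = \mathcal{A} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w}^* + \mathcal{A} \bullet_1 \mathbf{w}^* \bullet_3 \mathbf{w}^* + \mathcal{A} \bullet_1 \mathbf{w}^* \bullet_2 \mathbf{w}^*. \tag{19} $$

We can write (18) as:

$$ J(\mathbf{w}) = \frac{(\mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w})\,(\mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w})^*}{(\mathbf{w}\mathbf{R}\mathbf{w}^H)^3}. \tag{20} $$

Let $J_N = \mathcal{C} \bullet_1 \mathbf{w} \bullet_2 \mathbf{w}^* \bullet_3 \mathbf{w}$ and $J_D = \mathbf{w}\mathbf{R}\mathbf{w}^H$; then $J(\mathbf{w}) = J_N J_N^* / J_D^3$. Using (19), we obtain the gradient of $J(\mathbf{w})$:

$$ \frac{\partial J(\mathbf{w})}{\partial \mathbf{w}^*} = \frac{\dfrac{\partial J_N}{\partial \mathbf{w}^*} J_N^* + \dfrac{\partial J_N^*}{\partial \mathbf{w}^*} J_N}{J_D^3} - 3\, \frac{J_N J_N^*\, \dfrac{\partial J_D}{\partial \mathbf{w}^*}}{J_D^4}, \tag{21} $$

with

$$ \frac{\partial J_D}{\partial \mathbf{w}^*} = \mathbf{w}\mathbf{R}, \qquad \frac{\partial J_N}{\partial \mathbf{w}^*} = \mathcal{C} \bullet_1 \mathbf{w} \bullet_3 \mathbf{w} $$

and

$$ \frac{\partial J_N^*}{\partial \mathbf{w}^*} = \mathcal{C}^* \bullet_2 \mathbf{w} \bullet_3 \mathbf{w}^* + \mathcal{C}^* \bullet_1 \mathbf{w}^* \bullet_2 \mathbf{w}. $$

The proposed algorithm with fixed step $\mu$ and $\varepsilon$-based stop criterion, initialized with $\mathbf{w}^{(0)}$, consists of the following steps:

Gradient algorithm
Input: $\mu$, $\varepsilon$, $\mathbf{w}^{(0)}$, $\mathcal{C}$, $\mathbf{R}$
Output: $\mathbf{w}$
while $|J(\mathbf{w}^{(i)}) - J(\mathbf{w}^{(i-1)})| > \varepsilon$
  1. $\mathbf{d} = \partial J(\mathbf{w}^{(i-1)}) / \partial \mathbf{w}^{*(i-1)}$
  2. $\mathbf{w}^{(i)} = \mathbf{w}^{(i-1)} + \mu\,\mathbf{d}$
  3. $\mathbf{w}^{(i)} = \mathbf{w}^{(i)} / (\mathbf{w}^{(i)} \mathbf{R}\, \mathbf{w}^{(i)H})^{1/2}$

In a classical deflation context, the vector of observations is modified every time a new source has been recovered: its contribution is subtracted by a least squares approach. The reference signal $r(n)$ is taken as one of the modified observations so that the same source is not extracted twice during the deflation process. The main advantage of this deflation strategy is its simplicity; on the other hand, the estimation errors increase through the successive cancellations of the estimated source signals. Moreover, the estimation of the rank of the matrix $\mathbf{R}$ becomes more problematic through these successive steps. The gradient optimization method does not present this drawback.

An iterative procedure which improves the performance of the extraction stage has been proposed for the quadratic criterion in [5] and generalized to the cubic criterion in [6]. It is based on the idea that the closer the reference signal is to a particular source, the better the extraction performance. In practice, the previously obtained output of the extraction is used as a new reference signal, and this procedure may be repeated $N_i$ times.

4. SIMULATION RESULTS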

We now present computer simulations in order to illustrate the usefulness of the proposed algorithm in comparison with other deflation-based separation methods. We consider i.i.d. complex-valued 16-QAM source signals. The mixing system is complex-valued and its coefficients are drawn randomly according to a zero-mean, unit-variance normal distribution. We have chosen K = 3 source signals, L = 3 for the length of the mixing filter and N = 6 observations. All the results presented below are obtained from a set of 300 Monte Carlo realizations. At each run, the mixing system and the $N_e$ source samples are drawn randomly.

First, we compare the cubic and the quartic higher-order criteria in the case $N_i = 1$ in a deflation context. The cubic criterion is optimized both by the Higher Order Power Method (HOPM) [6], initialized by the Higher Order Singular Value Decomposition (algorithm named E3HOPM), and by the gradient (our proposed new algorithm, named E3Grad). The third algorithm, based on the quartic criterion and optimized by the gradient, is named E4Grad. For the two gradient algorithms, we have taken the following parameters: $\mathbf{w}^{(0)}(\lfloor ND/2 \rfloor) = 1$ with all the other components of $\mathbf{w}^{(0)}$ equal to zero, $\mu = 5 \cdot 10^{-3}$ and $\varepsilon = 10^{-9}$. We do not compare these algorithms with the quadratic criterion optimized either by the SVD-based method or by the gradient, because their results are not good in this case where there is no iterative procedure (see [5, 9]). Figure 1 illustrates how these algorithms behave when R is not full rank and its rank is unknown. These results show that the optimization by the HOPM-based method fails for the second and the third extracted sources, for 100% and 50% of the realizations respectively. That is not the case for the gradient-based optimization methods, for either the cubic or the quartic criterion.
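With the initialization, step size and tolerance quoted above, the fixed-step gradient iteration of Section 3 can be sketched in NumPy as follows. This is a minimal sketch and not the authors' code: the function names are illustrative, the tensor and the matrix R are assumed to be given (for instance as sample estimates), R is assumed Hermitian positive definite, and the iteration cap `max_iter` is a safeguard added here that does not appear in the paper.

```python
import numpy as np

def contrast_J(C, R, w):
    """Normalized criterion (18): |C x1 w x2 w* x3 w|^2 / (w R w^H)^3."""
    JN = np.einsum('ijk,i,j,k->', C, w, w.conj(), w)
    JD = np.real(w @ R @ w.conj())
    return np.abs(JN) ** 2 / JD ** 3

def gradient_J(C, R, w):
    """Gradient (21) of J with respect to w^*, returned as a length-P vector."""
    wc, Cc = w.conj(), C.conj()
    JN = np.einsum('ijk,i,j,k->', C, w, wc, w)
    JD = np.real(w @ R @ wc)
    dJN = np.einsum('ijk,i,k->j', C, w, w)           # C x1 w x3 w
    dJNc = (np.einsum('ijk,j,k->i', Cc, w, wc)       # C* x2 w x3 w*
            + np.einsum('ijk,i,j->k', Cc, wc, w))    # C* x1 w* x2 w
    dJD = w @ R                                      # w R
    return ((dJN * np.conj(JN) + dJNc * JN) / JD ** 3
            - 3.0 * np.abs(JN) ** 2 * dJD / JD ** 4)

def fixed_step_gradient(C, R, w0, mu=5e-3, eps=1e-9, max_iter=10000):
    """Fixed-step gradient ascent on J with renormalization after each step."""
    w = np.asarray(w0, dtype=complex)
    J_prev = contrast_J(C, R, w)
    for _ in range(max_iter):                        # max_iter: added safeguard
        w = w + mu * gradient_J(C, R, w)             # steps 1 and 2
        w = w / np.sqrt(np.real(w @ R @ w.conj()))   # step 3: unit output power
        J_new = contrast_J(C, R, w)
        if abs(J_new - J_prev) <= eps:               # epsilon-based stop criterion
            break
        J_prev = J_new
    return w
```

A typical call would set `w0 = np.zeros(P, dtype=complex)` and `w0[P // 2] = 1`, where P = ND; reading the index ⌊ND/2⌋ as a 0-based position is an assumption of this sketch. Since J is real-valued, its first-order variation for a small perturbation δ of w is 2 Re{(∂J/∂w*) δ^H}, which gives a simple finite-difference check of `gradient_J`.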
[Figure 1: three panels (1st, 2nd and 3rd extracted source), each showing the MSE on a logarithmic scale versus the 300 sorted realizations, for E3HOPM, E3Grad and E4Grad.]

Fig. 1. Sensitivity of the different methods to the rank estimation in a deflation context.

Second, we compare the three previous algorithms with two others based on the quadratic contrast function: the first one, named E2SVD, is optimized by an SVD and the second one, named E2GRAD, is optimized by a gradient. We have also assessed the iterative procedure with $N_i = 5$ for the reference-based quadratic and cubic contrast functions. In Tables 1 and 2 we give the average MSE for the three extracted sources in a deflation context, for $N_e = 5000$ samples, with $N_i = 1$ and $N_i = 5$ respectively.

Separation Method    1st       2nd       3rd
E2SVD                0.1379    0.2748    0.5187
E2GRAD               0.3060    0.3423    0.5338
E3HOPM               0.0087    0.1470    0.1568
E3GRAD               0.0153    0.0196    0.0282
E4GRAD               0.0059    0.0083    0.0120

Table 1. Average MSE for the 3 source extraction, Ni = 1

Regarding the results in Table 1, one can see that the newly proposed optimization method associated with the cubic criterion gives good results for the extraction of the 3 sources. As expected, this is not the case for the quadratic contrast optimized by an SVD or a gradient method, nor for the cubic contrast optimized by the HOPM method.

Separation Method    1st       2nd       3rd
E2SVD                0.0023    0.0975    0.1717
E2GRAD               0.0075    0.0091    0.0135
E3HOPM               0.0023    0.100     0.1787
E3GRAD               0.0024    0.0078    0.012
E4GRAD               0.0024    0.0083    0.012

Table 2. Average MSE for the 3 source extraction, Ni = 5

The results in Table 2 show that the iterative procedure, with $N_i = 5$, considerably improves our proposed approach, which gives nearly the same results as the kurtosis-based contrast optimized by a gradient.

5. CONCLUSION

In the context of the blind separation problem, we have proposed a new optimization algorithm associated with a recent contrast function. It is shown to be well adapted to a deflation scenario. Computer simulations illustrate interesting features and performance in comparison with the algebraic optimization algorithm and the kurtosis-based contrast optimized by a gradient.

6. REFERENCES

[1] P. Comon, “Contrasts for Multichannel Blind Deconvolution,” IEEE Signal Processing Letters, vol. 3, no. 7, pp. 209–211, July 1996.

[2] E. Moreau and J.-C. Pesquet, “Generalized Contrasts for Multichannel Blind Deconvolution of Linear Systems,” IEEE Transactions on Signal Processing, vol. 4, no. 6, pp. 182–183, June 1997.

[3] P. Loubaton and P. Regalia, “Blind Deconvolution of Multivariate Signals: A Deflation Approach,” in IEEE International Conference on Communications, ICC’93, 1993.

[4] J. K. Tugnait, “Identification and Deconvolution of Multichannel Linear Non-Gaussian Processes Using Higher Order Statistics and Inverse Filter Criteria,” IEEE Transactions on Signal Processing, vol. 45, no. 3, pp. 658–672, March 1997.

[5] M. Castella, S. Rhioui, E. Moreau, and J.-C. Pesquet, “Quadratic Higher-Order Criteria for Iterative Blind Separation of a MIMO Convolutive Mixture of Sources,” IEEE Transactions on Signal Processing, vol. 55, no. 1, pp. 218–232, January 2007.

[6] R. Dubroca, C. De Luigi, M. Castella, and E. Moreau, “A General Algebraic Algorithm for Blind Extraction of One Source in a MIMO Convolutive Mixture,” IEEE Transactions on Signal Processing, vol. 58, no. 5, pp. 2484–2493, May 2010.

[7] A. Gorokhov and P. Loubaton, “Subspace-Based Techniques for Blind Separation of Convolutive Mixtures with Temporally Correlated Sources,” IEEE Transactions on Circuits and Systems, vol. 44, no. 9, pp. 813–820, September 1997.

[8] C. Simon, P. Loubaton, and C. Jutten, “Separation of a Class of Convolutive Mixtures: A Contrast Function Approach,” Signal Processing, vol. 81, pp. 883–887, 2001.

[9] M. Castella and E. Moreau, “A New Optimization Method for Reference-Based Quadratic Contrast Functions in a Deflation Scenario,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2009), Philadelphia, USA, April 2009, pp. 3161–3164.
