Complex Blind Source Separation via Simultaneous Strong Uncorrelating Transform
Hao Shen and Martin Kleinsteuber
Institute for Data Processing, Technische Universität München, Germany
{hao.shen,kleinsteuber}@tum.de
Abstract. In this paper, we address the problem of complex blind source separation (BSS), in particular, the separation of nonstationary complex signals. It is known that, under certain conditions, complex BSS can be solved effectively by the so-called Strong Uncorrelating Transform (SUT), which simultaneously diagonalizes one Hermitian positive definite and one complex symmetric matrix. Our current work generalizes the SUT to simultaneously diagonalize more than two matrices. A Conjugate Gradient (CG) algorithm for computing a simultaneous SUT is developed in an appropriate manifold setting of the problem, namely the complex oblique projective manifold. The performance of our method, in terms of separation quality, is investigated by several numerical experiments.

Key words: Complex blind source separation, nonstationary signal, simultaneous strong uncorrelating transform, complex oblique projective manifold, conjugate gradient algorithm.
1 Introduction
In recent years, complex Independent Component Analysis (ICA) has become a prominent method for solving the problem of complex Blind Source Separation (BSS). Its applications can be found in convolutive blind source separation, wireless communication, and magnetic resonance imaging analysis. Although complex ICA and its real counterpart were born as twins [1], it is, surprisingly, less well understood than real ICA from both theoretical and practical perspectives. This difference is mainly due to the effects of the so-called circularity or non-circularity of complex signals on ICA models. Generally speaking, (non)circularity describes the statistical relationship between the real and imaginary parts of complex signals. A recent work by Eriksson and Koivunen [2] shows that, in the scenario where the source signals are non-circular and the values of their circularity coefficients are distinct, complex ICA can be solved effectively by the so-called Strong Uncorrelating Transform (SUT) [3], which utilizes only second-order statistics of the observed mixtures. A robust extension of the SUT, namely the Generalized Uncorrelating Transform (GUT), has been developed in [4]. It is worth noticing that both approaches require a whitening of the observations, which unfortunately is statistically inefficient in some applications, especially
when additive noise is present [5]. A fixed-point algorithm for computing a single SUT without whitening has been developed in [6]. From an algorithmic point of view, both the SUT and the GUT require a simultaneous diagonalization of one Hermitian positive definite and one complex symmetric matrix. In this work, we are interested in the blind separation of nonstationary complex signals. By exploiting the fact that the second-order statistics of nonstationary signals are in general time-varying, we develop a conjugate gradient (CG) algorithm to simultaneously diagonalize several Hermitian positive definite and complex symmetric matrices.

The paper is organized as follows. Section 2 briefly introduces the linear complex ICA problem and motivates a simultaneous SUT solution. In Section 3, we construct an appropriate manifold setting for the problem, namely, the complex oblique projective manifold. Section 4 develops an intrinsic conjugate gradient algorithm for computing a simultaneous SUT. Finally, in Section 5, the performance of our proposed approach in terms of separation quality is investigated by several experiments.
2 Complex Blind Source Separation
In this work, we denote by (·)^T the matrix transpose, by (·)¯ the complex conjugate, and by (·)^H the Hermitian transpose. Let s(t) = [s_1(t), . . . , s_m(t)]^T ∈ C^m be an m-dimensional complex vector representing the time series of m statistically independent complex signals. The instantaneous linear complex BSS model is given by

    w(t) = A s(t),    (1)
where A ∈ C^{m×m} is the mixing matrix of full rank and w(t) = [w_1(t), . . . , w_m(t)]^T ∈ C^m represents the m observed linear mixtures of s(t). Without loss of generality, we assume the sources s(t) have zero mean and unit variance, i.e.,

    E[s(t)] = 0   and   cov(s) := E[s(t) s^H(t)] = I_m,    (2)
where E[·] denotes the expectation over the time index t, and I_m is the m×m identity matrix. The expression cov(s) is referred to as the complex covariance matrix of s(t). Furthermore, without loss of generality, we assume that the so-called pseudo-covariance matrix of s(t) has a real diagonal structure, i.e.,

    pcov(s) := E[s(t) s^T(t)] = Λ ∈ R^{m×m},    (3)
where Λ := diag(λ_1, . . . , λ_m) and the diagonal entries λ_i ≥ 0, for all i = 1, . . . , m, are called the circularity coefficients of the corresponding signals s_i(t). If all λ_i's are zero, then the sources s(t) are called second-order circular. We refer to [2, 6] and the references therein for further discussions. The task of the linear complex BSS problem (1) is to recover the source signals s(t) by estimating the mixing matrix A or its inverse A^{-1} based only on the observations w(t), via the demixing model

    y(t) = X^H w(t),    (4)
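For intuition, the second-order quantities above are easy to estimate from samples. A minimal numpy sketch (the sample size and the coefficient values below are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
T, m = 200_000, 3

# Target circularity coefficients (illustrative values).
lam = np.array([0.8, 0.4, 0.1])

# Noncircular unit-variance sources: real and imaginary parts are
# independent Gaussians with different powers, so that
# E[|s_i|^2] = 1 and E[s_i^2] = lam_i.
a = np.sqrt((1 + lam) / 2)                # std of the real part
b = np.sqrt((1 - lam) / 2)                # std of the imaginary part
s = (a[:, None] * rng.standard_normal((m, T))
     + 1j * b[:, None] * rng.standard_normal((m, T)))

cov = s @ s.conj().T / T                  # sample cov(s)  ~ I_m,       cf. (2)
pcov = s @ s.T / T                        # sample pcov(s) ~ diag(lam), cf. (3)
```

For second-order circular sources (all λ_i = 0), the sample pseudo-covariance would vanish instead.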
where X^H ∈ C^{m×m} is the demixing matrix, an estimate of A^{-1}, and y(t) ∈ C^m represents the corresponding extracted signals. According to Theorem 11 in [1], a correct demixing matrix X* ∈ C^{m×m} can only be identified up to a column-wise permutation and complex scaling, i.e., X* is the inverse of A^H up to an m × m permutation matrix P and an m × m complex diagonal matrix D:

    X* = A^{-H} D P.    (5)
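The indeterminacy (5) can be checked numerically: applying X*^H = P^H D^H A^{-1} to the mixtures leaves the monomial global transfer matrix P^H D^H, i.e., each output equals one source up to a complex scale. A minimal sketch (the random A, D, P are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
m = 4
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
D = np.diag(rng.standard_normal(m) + 1j * rng.standard_normal(m))
P = np.eye(m)[rng.permutation(m)]             # m x m permutation matrix

# A valid demixing matrix according to (5): X* = A^{-H} D P.
X_star = np.linalg.inv(A.conj().T) @ D @ P

# Global transfer matrix acting on the sources: y = X*^H A s.
G = X_star.conj().T @ A                       # equals P^H D^H (monomial)
```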
In the rest of this section, we quickly review the SUT approach and then generalize it to simultaneously diagonalize more than two matrices. Given the observations w(t) from the ICA model (1), the second-order statistics of w(t) are computed as

    cov(w) := E[w(t) w^H(t)] = A A^H    (6)

and

    pcov(w) := E[w(t) w^T(t)] = A Λ A^T.    (7)

A matrix X ∈ C^{m×m} which transforms both the complex covariance (6) and the pseudo-covariance matrix (7) into the following diagonal forms,

    X^H cov(w) X = I_m   and   X^H pcov(w) X̄ = Λ,    (8)
is called a strong uncorrelating transform of w(t). According to Theorem 2 in [3], if the values of the circularity coefficients λ_i are distinct, then any SUT of w(t) is a correct demixing matrix of the problem (1). It is important to notice that transforming the pseudo-covariance matrix into a real diagonal structure as shown in (8) is not necessary for solving the complex BSS problem (1).

Now let us consider the situation where the sources s(t) are nonstationary. Then, in general, both the covariance and the pseudo-covariance matrices of s(t), and consequently of w(t) as well, are time-varying. One well-known technique to deal with this situation is to construct a set of covariance matrices of w(t) at different time instances, and then to simultaneously diagonalize this set [7]. Due to the fact that the complex covariance does not necessarily provide complete second-order statistics of complex signals [8], we propose to additionally construct a set of pseudo-covariance matrices, and then to simultaneously diagonalize these two sets in a similar manner as the SUT (8), which is referred to here as a simultaneous SUT.

To summarize, in what follows, we are interested in solving the following problem. Given a set of Hermitian positive definite matrices {C_i}_{i=1}^N and a set of complex symmetric matrices {R_i}_{i=1}^N, the task is to find a nonsingular matrix X such that

    X^H C_i X   and   X^H R_i X̄,    (9)

for all i = 1, . . . , N, are simultaneously diagonalized, or approximately simultaneously diagonalized subject to a certain diagonality measure.
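For reference, in the classical case N = 1 an SUT can be computed in closed form by whitening followed by a Takagi (symmetric SVD) factorization, cf. [2, 3]. The sketch below is one standard route under the assumption of distinct, nonzero circularity coefficients; it is not the CG method developed in this paper, and the helper name `sut` is ours:

```python
import numpy as np

def sut(C, R):
    """One strong uncorrelating transform X for a Hermitian positive
    definite C and a complex symmetric R, assuming the singular values
    of the whitened R are distinct and nonzero."""
    # Whitening: W = C^{-1/2}, the Hermitian inverse square root.
    ew, V = np.linalg.eigh(C)
    W = V @ np.diag(ew ** -0.5) @ V.conj().T
    # The whitened pseudo-covariance M = W R W^T is complex symmetric.
    M = W @ R @ W.T
    # Takagi factorization M = Q S Q^T from the SVD M = U S V^H:
    # for symmetric M with distinct singular values, conj(V) = U D with
    # D diagonal unitary, and Q = U D^{1/2}.
    U, S, Vh = np.linalg.svd(M)
    d = np.diag(U.conj().T @ Vh.T)        # diagonal entries of D
    Q = U @ np.diag(np.sqrt(d / np.abs(d)))
    # X satisfies X^H C X = I and X^H R conj(X) = diag(S).
    return W @ Q

# Illustrative model: C = A A^H, R = A diag(lam) A^T, as in (6)-(7).
rng = np.random.default_rng(2)
m = 3
lam = np.array([0.9, 0.5, 0.2])           # distinct circularity coefficients
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
X = sut(A @ A.conj().T, A @ np.diag(lam) @ A.T)
```

Since the SVD returns singular values in descending order, the recovered diagonal matches diag(λ_1, . . . , λ_m) when the λ_i are listed in decreasing order.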
3 Complex Oblique Projective Manifold
In this section, we construct an appropriate manifold setting for the problem (1), such that the ambiguity due to column-wise complex scaling as shown in
(5) is eliminated. The optimal solutions of complex BSS on this search space are isolated in the generic case. This allows us to: (i) reduce the number of parameters over which one needs to optimize, and (ii) simplify the development of our proposed algorithm in the next section. We refer to [9] for deeper insights into the topic.

Let us denote by GL(m, C) the set of all m × m invertible complex matrices. The starting point of our construction is the so-called complex oblique manifold, whose real counterpart was developed for real ICA [10], i.e.,

    O(m, C) := { X ∈ GL(m, C) | ddiag(X^H X) = I_m },    (10)

where ddiag(Z) forms a diagonal matrix whose diagonal entries are those of Z. By the regular value theorem, the set O(m, C) is an m(2m − 1)-real-dimensional differentiable manifold.

For a given X = [x_1, . . . , x_m] ∈ O(m, C), it is clear that each column x_i identifies a one-complex-dimensional linear subspace of C^m. It is known that the set of all one-complex-dimensional linear subspaces of C^m forms a differentiable manifold, namely, the (m − 1)-dimensional complex projective space CP^{m−1}. In this work, we identify it with the set of all rank-one Hermitian projectors, i.e.,

    CP^{m−1} := { P ∈ C^{m×m} | P^H = P, P^2 = P, tr(P) = 1 }.    (11)

Let us denote by

    u(m) := { Ω ∈ C^{m×m} | Ω = −Ω^H }    (12)
the set of skew-Hermitian matrices. Then, the tangent space at P ∈ CP^{m−1} is given by

    T_P CP^{m−1} := { [P, Ω] | Ω ∈ u(m) },    (13)

with the matrix commutator [A, B] := AB − BA. Endowing T_P CP^{m−1} with the inner product

    g : T_P CP^{m−1} × T_P CP^{m−1} → R,    g(φ, ψ) := ℜ tr(φ · ψ),    (14)

turns CP^{m−1} into a Riemannian manifold. Here, ℜ(Z) denotes the real part of a complex number Z. Then, the geodesic through P ∈ CP^{m−1} in direction φ ∈ T_P CP^{m−1} is given by

    γ_{P,φ} : R → CP^{m−1},    γ_{P,φ}(t) := e^{t[φ,P]} P e^{−t[φ,P]}.    (15)
Here, e^{(·)} denotes the matrix exponential. Finally, the parallel transport of ψ ∈ T_P CP^{m−1} with respect to the Levi-Civita connection along the geodesic γ_{P,φ}(t) is given by

    τ_{P,φ}(ψ) = e^{[φ,P]} ψ e^{−[φ,P]}.    (16)

By exploiting the fact that, for a given X ∈ O(m, C), the following matrix

    X X^H = Σ_{i=1}^m x_i x_i^H,    (17)
where x_i x_i^H, for all i = 1, . . . , m, is a rank-one Hermitian projector, is positive definite, we construct a set of constrained collections of m rank-one Hermitian projectors as

    Q(m, C) := { (P_1, . . . , P_m) | P_i ∈ CP^{m−1}, det(Σ_{i=1}^m P_i) > 0 }.    (18)
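The geometric operations above are straightforward to realize numerically. A minimal sketch (assuming scipy for the matrix exponential; the point and direction are random illustrations) that moves a rank-one projector along the geodesic (15) and verifies that it stays on CP^{m−1}:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
m = 4

# A point on CP^{m-1}: the rank-one Hermitian projector P = x x^H, ||x|| = 1.
x = rng.standard_normal(m) + 1j * rng.standard_normal(m)
x /= np.linalg.norm(x)
P = np.outer(x, x.conj())

# A tangent vector phi = [P, Omega] with Omega skew-Hermitian, cf. (13).
Z = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Omega = (Z - Z.conj().T) / 2
phi = P @ Omega - Omega @ P

# Geodesic (15): gamma(t) = e^{t [phi, P]} P e^{-t [phi, P]}.
K = phi @ P - P @ phi                     # [phi, P] is skew-Hermitian
Pt = expm(0.7 * K) @ P @ expm(-0.7 * K)   # gamma(0.7), still a projector
```

Because [φ, P] is skew-Hermitian, the exponential is unitary, so Pt remains a rank-one Hermitian projector, i.e., a point on CP^{m−1}.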
It is worth noticing that Q(m, C) is an open and dense Riemannian submanifold of the m-fold product of CP^{m−1} with the Euclidean product metric, i.e.,

    Q̄(m, C) = CP^{m−1} × . . . × CP^{m−1} =: (CP^{m−1})^m.    (19)

Here, Q̄(m, C) denotes the closure of Q(m, C). It then follows that the dimension of Q(m, C) is equal to the dimension of (CP^{m−1})^m, i.e.,

    dim Q(m, C) = m dim CP^{m−1} = 2m(m − 1),    (20)
and that the tangent spaces, the geodesics, and the parallel transport for Q(m, C) and (CP^{m−1})^m coincide locally. In other words, Q(m, C) is not a geodesically complete manifold. Finally, we present the following results about Q(m, C) without further explanation. Given any Υ = (P_1, . . . , P_m) ∈ Q(m, C), the tangent space of Q(m, C) at Υ is

    T_Υ Q(m, C) ≅ T_{P_1} CP^{m−1} × . . . × T_{P_m} CP^{m−1}.    (21)

Let Φ = (φ_1, . . . , φ_m) ∈ T_Υ Q(m, C) with φ_i ∈ T_{P_i} CP^{m−1} for all i = 1, . . . , m. The product metric on T_Υ Q(m, C) is constructed as

    G : T_Υ Q(m, C) × T_Υ Q(m, C) → R,    G(Φ, Ψ) := Σ_{i=1}^m ℜ tr(φ_i · ψ_i).    (22)
The geodesic through Υ ∈ Q(m, C) in direction Φ ∈ T_Υ Q(m, C) is given by

    γ_{Υ,Φ} : R → Q(m, C),    γ_{Υ,Φ}(t) := (γ_{P_1,φ_1}(t), . . . , γ_{P_m,φ_m}(t)),    (23)
and the parallel transport of Ψ ∈ T_Υ Q(m, C) with respect to the Levi-Civita connection along the geodesic γ_{Υ,Φ}(t) is

    τ_{Υ,Φ}(Ψ) := (τ_{P_1,φ_1}(ψ_1), . . . , τ_{P_m,φ_m}(ψ_m)).    (24)

4 A CG Algorithm for Simultaneous SUT
In this section we develop a CG algorithm for computing a simultaneous SUT. First of all, we adapt a popular diagonality measure of matrices, namely the off-norm cost function, to our problem, i.e.,

    f : O(m, C) → R,    f(X) := Σ_{k=1}^N ( (1/2) ‖off(X^H C_k X)‖_F^2 + (1/2) ‖off(X^H R_k X̄)‖_F^2 ),    (25)
where ‖·‖_F denotes the Frobenius norm of matrices and off(Z) := Z − ddiag(Z) sets the diagonal entries of Z to zero. A direct calculation gives

    f(X) = Σ_{1≤i<j≤m} Σ_{k=1}^N ( x_i^H C_k x_j (x_i^H C_k x_j)^H + x_i^H R_k x̄_j (x_i^H R_k x̄_j)^H )
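The cost (25) is cheap to evaluate directly. A minimal numpy sketch (the test matrices are illustrative, built to share an exact joint diagonalizer); note the conjugate on the right-hand factor of the pseudo-covariance term, since pcov(X^H w) = X^H pcov(w) X̄:

```python
import numpy as np

def off(M):
    """Off-diagonal part: off(M) = M - ddiag(M)."""
    return M - np.diag(np.diag(M))

def cost(X, Cs, Rs):
    """Off-norm diagonality measure (25) at a candidate demixing X."""
    f = 0.0
    for C, R in zip(Cs, Rs):
        f += 0.5 * np.linalg.norm(off(X.conj().T @ C @ X)) ** 2
        f += 0.5 * np.linalg.norm(off(X.conj().T @ R @ X.conj())) ** 2
    return f

# Matrices C_k = A D_k A^H, R_k = A L_k A^T share the diagonalizer X = A^{-H}.
rng = np.random.default_rng(4)
m, N = 3, 4
A = rng.standard_normal((m, m)) + 1j * rng.standard_normal((m, m))
Cs = [A @ np.diag(rng.uniform(0.5, 2.0, m)) @ A.conj().T for _ in range(N)]
Rs = [A @ np.diag(rng.uniform(0.0, 1.0, m)) @ A.T for _ in range(N)]

f_true = cost(np.linalg.inv(A.conj().T), Cs, Rs)   # ~ 0 at the true demixer
f_eye = cost(np.eye(m), Cs, Rs)                    # > 0 at a generic point
```

Minimizing this cost over the manifold setting of Section 3 is precisely the task of the proposed CG iteration.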