AN ALGORITHM FOR THE BLIND IDENTIFICATION OF N INDEPENDENT SIGNALS WITH 2 SENSORS

Anisse Taleb
ATRI, Curtin University of Technology, GPO Box U1987, Perth WA 6845, Australia.
email: [email protected]

ABSTRACT
This paper presents a novel procedure for the blind identification of a linear mixture of n sources with 2 sensors. It is shown that the second characteristic function obeys a partial differential equation (PDE) whose coefficients are directly related to the mixture coefficients. A uniqueness result allows the design of an estimation procedure based on this PDE. An algorithm is therefore proposed, and computer experiments illustrate its performance.
1. INTRODUCTION

When the number of sources is greater than the number of sensors, exact source separation becomes impossible. Despite its high practical interest, this case, often called underdetermined source separation, has only been treated in particular cases. In [2], Cardoso shows how more sources than sensors can be identified by using only the fourth-order cumulants. Cao et al. [1] give necessary and sufficient conditions for the existence of a separating matrix, which separates the sources into several groups. Comon et al. [4] address the problem of separation of discrete sources by forming virtual measurements in order to increase the dimension of the observation vector. These supplementary observations are nonlinear functions of the truly observed sensor outputs. In a more theoretical framework [3], Comon presents blind identification algorithms based on quantic decompositions. An algorithm for the case of 3 sources and 2 sensors is presented in [6]. Finally, De Lathauwer et al. [7] present techniques for the identification of mixtures of complex-valued signals based on certain symmetries of the fourth-order cumulant tensor.

In this paper we consider the problem of blind identification of $n$ independent sources by the use of only 2 sensors. We only consider a static mixing model, in which the observed signals are written as
$$x_1(t) = \sum_{i=1}^{n} a_i s_i(t), \qquad x_2(t) = \sum_{i=1}^{n} b_i s_i(t) \qquad (1)$$
The source signals $s_1(t), s_2(t), \ldots, s_n(t)$ are assumed real valued and statistically independent. Extension of the results to complex signals is possible but is beyond the scope of this paper. Temporal dependencies will not be considered; one may therefore regard each source signal as a random variable. In the model (1), observation noise is deliberately missing: in fact, if one assumes Gaussian noise, then by an appropriate linear transformation the observed signals can be written in the form of equation (1) with two supplementary independent sources. The objective of blind identification is to estimate, given a finite sample, the coefficients of the mixture $(a_i, b_i)$, $i = 1, \ldots, n$. These are assumed to be real and pairwise linearly independent, $a_i b_j - a_j b_i \neq 0$, $\forall i \neq j$. In fact, if $(a_k, b_k)$ is proportional to $(a_p, b_p)$, then the source $s_k$ adds to $s_p$ to form a new independent source, thus reducing the model to $n - 1$ independent sources.
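As an illustration, the following minimal sketch generates observations according to model (1). The coefficient values, the sample size and the binary source distribution (the one also used in Section 5) are assumptions of the example, not values prescribed by the paper:

```python
import numpy as np

# A minimal sketch of model (1): n = 4 sources observed through 2 sensors.
rng = np.random.default_rng(0)
n, T = 4, 1000
a = np.array([1.0, 0.5, -0.3, 0.8])   # chosen so a_i*b_j - a_j*b_i != 0
b = np.array([0.2, 1.0, 0.9, -0.6])   # for every pair i != j
s = rng.choice([-1.0, 1.0], size=(n, T))   # binary sources, as in Section 5
x1 = a @ s                                 # first sensor output
x2 = b @ s                                 # second sensor output
```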
2. UNIQUENESS OF SOLUTIONS

The model (1) contains inherent indeterminacies that cannot be resolved without additional information about the sources. The first is the permutation indeterminacy: the sources in model (1) are unordered and lead to unordered mixture coefficients. The second indeterminacy is related to the amplitude exchange between a source and its coefficients. A theorem proved by Darmois in 1953 [5] (see also [8] for easier accessibility) addresses the uniqueness of the model (1):

Theorem 1 Let $x_1$ and $x_2$ be two random variables admitting the following two decompositions:
$$x_1 = \sum_{i=1}^{n} a_i s_i = \sum_{i=1}^{p} \alpha_i y_i \qquad (2)$$
$$x_2 = \sum_{i=1}^{n} b_i s_i = \sum_{i=1}^{p} \beta_i y_i \qquad (3)$$
where the components of the decompositions, the $s_i$'s and the $y_i$'s, are independent, and $a_i b_j - a_j b_i \neq 0$ as well as $\alpha_i \beta_j - \alpha_j \beta_i \neq 0$, $\forall i \neq j$. If a couple $(a_i, b_i)$ is not proportional to any couple $(\alpha_j, \beta_j)$, $j = 1, \ldots, p$, then $s_i$ is necessarily Gaussian.

As a consequence of this theorem, the model (1) is essentially unique provided no source is Gaussian; moreover, the number of non-Gaussian sources is unique. Hence, identifiability of the non-Gaussian sources and of their number is possible.

3. THE BASIC PARTIAL DIFFERENTIAL EQUATION

Let $\Psi_{x_1,x_2}$ denote the logarithm of the joint characteristic function (the second characteristic function) of the random variables $x_1, x_2$ of model (1):
$$\Psi_{x_1,x_2}(u,v) = \log E[\exp(iux_1 + ivx_2)], \qquad (u,v) \in \Omega \subset \mathbb{R}^2 \qquad (4)$$
where $\Omega$ is the largest rectangular domain of $\mathbb{R}^2$ containing the origin on which the characteristic function of the couple $x_1, x_2$ does not vanish. Such a domain exists and is non-empty due to the continuity of the characteristic function. Taking into account the independence of the sources, $\Psi_{x_1,x_2}$ may be expanded, with obvious notations, into
$$\Psi_{x_1,x_2}(u,v) = \sum_{i=1}^{n} \psi_{s_i}(a_i u + b_i v), \qquad (u,v) \in \Omega \qquad (5)$$
Assume that each $\psi_{s_i}$ admits derivatives up to the $n$-th order in the interval $\Delta_i = \{a_i u + b_i v;\ (u,v) \in \Omega\}$. A sufficient condition for the existence of such derivatives is the existence of absolute moments up to the $n$-th order: $E[|s_i|^k]$ exists and is finite for all $k \leq n$. Let the differential operators $D_i$ be defined by
$$D_i = b_i \partial_u - a_i \partial_v \qquad (6)$$
where $\partial_u$ and $\partial_v$ denote the partial derivative operators with respect to $u$ and $v$, respectively. It is then straightforward to establish that
$$\Big(\prod_{i=1}^{n} D_i\, \Psi_{x_1,x_2}\Big)(u,v) = 0, \qquad (u,v) \in \Omega \qquad (7)$$
which shows that if $x_1, x_2$ follow the model (1), the second characteristic function of the couple $x_1, x_2$ satisfies a linear homogeneous PDE of the $n$-th order, provided the existence assumptions hold. The differential operator $\prod_{i=1}^{n} D_i$ can be expanded as
$$\prod_{i=1}^{n} D_i = \sum_{i=0}^{n} q_i\, \partial_u^{n-i} \partial_v^{i} \qquad (8)$$
leading to the following PDE:
$$\sum_{i=0}^{n} q_i\, \frac{\partial^n \Psi_{x_1,x_2}(u,v)}{\partial u^{n-i}\, \partial v^{i}} = 0 \qquad (9)$$
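As a sanity check on (6)-(8), the annihilation property (7) can be verified symbolically for a small case. A minimal sketch for n = 2, assuming sympy is available:

```python
# Verify that (D1 D2 Psi)(u, v) = 0 for the expansion (5) with n = 2.
import sympy as sp

u, v = sp.symbols('u v')
a1, b1, a2, b2 = sp.symbols('a1 b1 a2 b2')
psi1, psi2 = sp.Function('psi1'), sp.Function('psi2')

# Second characteristic function of the mixture, expansion (5) for n = 2.
Psi = psi1(a1*u + b1*v) + psi2(a2*u + b2*v)

def D(f, a, b):
    # Differential operator D_i = b_i d/du - a_i d/dv of equation (6).
    return b*sp.diff(f, u) - a*sp.diff(f, v)

residual = D(D(Psi, a1, b1), a2, b2)   # left-hand side of equation (7)
print(sp.simplify(residual))           # prints 0
```

Each operator $D_i$ kills the term $\psi_{s_i}(a_i u + b_i v)$, since $b_i a_i - a_i b_i = 0$; this is exactly why the product of all $n$ operators annihilates the whole sum.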
where the $q_i$, $i = 0, \ldots, n$, are functions of $(a_i, b_i)$, $i = 1, \ldots, n$. For this result to be useful, one needs a uniqueness result, in the sense that if $x_1, x_2$ is a linear mixture then the coefficients $q_i$ are unique, up to certain indeterminacies. Here we will show a rather more general result. Suppose that $x_1, x_2$ is a linear mixture from model (1) and that all sources are non-Gaussian. Furthermore, suppose that the following holds for some integer $p$ and a set of coefficients $\gamma_i$, $i = 0, \ldots, p$:
$$\sum_{i=0}^{p} \gamma_i\, \frac{\partial^p \Psi_{x_1,x_2}(u,v)}{\partial u^{p-i}\, \partial v^{i}} = 0 \qquad (10)$$
By the expansion (5) we then have
$$\sum_{i=1}^{n} \lambda_i\, \psi_{s_i}^{(p)}(a_i u + b_i v) = 0, \qquad (u,v) \in \Omega \qquad (11)$$
with $\lambda_i = \sum_{j=0}^{p} \gamma_j\, a_i^{p-j} b_i^{j}$. Now, since the $(a_i, b_i)$ are pairwise linearly independent, by Lemma 1.5.1 of [8] the functions $\lambda_i \psi_{s_i}^{(p)}$, $i = 1, \ldots, n$, are necessarily polynomials of a certain degree, and so is each $\lambda_i \psi_{s_i}$. However, by the Marcinkiewicz theorem, these polynomials must be of degree at most 2. Since all the sources are assumed non-Gaussian, this implies that $\lambda_i = 0$, i.e. $\sum_{j=0}^{p} \gamma_j\, a_i^{p-j} b_i^{j} = 0$. Let $F(x,y)$ denote the polynomial
$$F(x,y) = \sum_{j=0}^{p} \gamma_j\, x^{p-j} y^{j} \qquad (12)$$
Then $F(a_i, b_i) = 0$ for all $i = 1, \ldots, n$. However, the polynomial $F$ can have at most $p$ linearly independent roots, which shows that $p$ must be greater than or equal to $n$. The polynomial $F$ can therefore be written as
$$F(x,y) = Q(x,y) \prod_{i=1}^{n} (b_i x - a_i y) = Q(x,y) \sum_{i=0}^{n} q_i\, x^{n-i} y^{i} = \sum_{i=0}^{p} \gamma_i\, x^{p-i} y^{i} \qquad (13)$$
where $Q$ is the quotient polynomial of $F$ by $\prod_{i=1}^{n} (b_i x - a_i y)$, whose degree is equal to $p - n$. Equation (13) shows how the coefficients $q_i$ and $\gamma_i$ are related. Consequences of this equation are:
- $n$ is the smallest integer $p$ such that there exist coefficients $\gamma_i$, $i = 0, \ldots, p$, not all equal to zero, such that
$$\sum_{i=0}^{p} \gamma_i\, \frac{\partial^p \Psi_{x_1,x_2}(u,v)}{\partial u^{p-i}\, \partial v^{i}} = 0, \qquad (u,v) \in \Omega$$
- If $p = n$, then the $\gamma_i$ are unique up to multiplication by a constant; moreover, the $(a_i, b_i)$ are the roots of the polynomial $\sum_{i=0}^{n} \gamma_i\, x^{n-i} y^{i}$ (see the numerical sketch below).
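The second consequence suggests a direct numerical recovery: setting $x = 1$, $y = t$ in $F$ shows that the ratios $t_i = b_i/a_i$ are the roots of the univariate polynomial $\sum_{i=0}^{n} q_i t^i$. A minimal sketch, assuming numpy and that all $a_i \neq 0$:

```python
import numpy as np

def mixture_from_q(q):
    """Recover the couples (a_i, b_i), up to scale and order, from the PDE
    coefficients q = (q_0, ..., q_n) of equation (9).

    Dehomogenising F(x, y) = sum_i q_i x^(n-i) y^i at x = 1, y = t shows
    that the ratios t_i = b_i / a_i are the roots of sum_i q_i t^i.
    Assumes a_i != 0 for all i (equivalently, q_n != 0)."""
    t = np.roots(np.asarray(q)[::-1])   # np.roots wants highest degree first
    couples = np.stack([np.ones_like(t), t], axis=1)
    # The scale of each couple is unidentifiable; normalise to unit length.
    return couples / np.linalg.norm(couples, axis=1, keepdims=True)
```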
In the case where Gaussian noise is present and the number of sources is $n \geq 3$, the PDE will be of order strictly greater than two, which simply eliminates the contribution of the Gaussian noise: the second characteristic function of a Gaussian random variable is a polynomial of the second order. This observation is similar to the insensitivity of cumulants of order greater than 2 to additive Gaussian noise. When the number of sources equals the number of sensors, $n = 2$, the previous result can be linked to the result of Yeredor [9], who used the diagonalisation of the Hessian of the second characteristic function for the estimation of the mixing matrix. Based on this result, consistent estimation of the number of independent non-Gaussian sources as well as of the mixture coefficients is possible. In the next section, the estimation of the mixture coefficients will be addressed. The estimation of the number of independent sources is highly relevant in many applications; the results of this section have a potential application to this problem.
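To make the noise-elimination argument above concrete: for a Gaussian variable $g$ with mean $\mu$ and variance $\sigma^2$, the second characteristic function is the quadratic
$$\psi_g(t) = \log E[e^{itg}] = i\mu t - \frac{\sigma^2 t^2}{2}, \qquad \frac{d^k \psi_g}{dt^k} = 0 \quad \text{for } k \geq 3,$$
so any derivative of order $n \geq 3$ in (9) annihilates the Gaussian term in the expansion (5).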
4. ESTIMATION PROCEDURE

Based on the previous result, and given the number of sources $n$, the estimation of the mixture coefficients can be done in two steps:

1. find $q_0, q_1, \ldots, q_n$ such that
$$\sum_{i=0}^{n} q_i\, \frac{\partial^n \Psi_{x_1,x_2}(u,v)}{\partial u^{n-i}\, \partial v^{i}} = 0, \qquad (u,v) \in \Omega \qquad (14)$$

2. compute $(a_i, b_i)$, $i = 1, \ldots, n$, as the roots of the polynomial
$$F(x,y) = \sum_{j=0}^{n} q_j\, x^{n-j} y^{j}$$

In practice, step (1) of the estimation procedure requires the partial derivatives of the logarithm of the characteristic function on $\Omega$. These can be estimated consistently by the use of the empirical characteristic function which, given a sample of data, can be written as
$$\hat{\Phi}_{x_1,x_2}(u,v) = \frac{1}{T} \sum_{t=0}^{T-1} \exp(jux_1(t) + jvx_2(t)) \qquad (15)$$
The estimate of $\frac{\partial^n \Psi_{x_1,x_2}(u,v)}{\partial u^{n-i}\, \partial v^{i}}$ will then be written as
$$\frac{\partial^n \log \hat{\Phi}_{x_1,x_2}(u,v)}{\partial u^{n-i}\, \partial v^{i}} \qquad (16)$$
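The paper does not prescribe a numerical differentiation scheme for (16). The sketch below, assuming numpy, evaluates the empirical characteristic function (15) at a point and approximates (16) by central finite differences; the step size h is an assumed tuning parameter:

```python
import numpy as np
from math import comb

def ecf(x1, x2, u, v):
    # Empirical characteristic function (15) at the point (u, v).
    return np.mean(np.exp(1j * (u * x1 + v * x2)))

def d_log_ecf(x1, x2, u, v, n, i, h=0.05):
    """Central finite-difference estimate of (16), the mixed partial
    derivative d^n log(Phi_hat) / du^(n-i) dv^i at (u, v)."""
    m = n - i
    acc = 0.0
    for p in range(m + 1):
        for q in range(i + 1):
            w = (-1) ** (p + q) * comb(m, p) * comb(i, q)
            acc = acc + w * np.log(ecf(x1, x2,
                                       u + (m / 2 - p) * h,
                                       v + (i / 2 - q) * h))
    return acc / h ** n
```

Note that $\log \hat{\Phi}$ is complex valued, so this estimate is complex in general; this is why the real part appears in the eigenvector computation below.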
Let $h(u,v)$ denote the column vector of partial derivatives whose $i$-th component is $\frac{\partial^n \Psi_{x_1,x_2}(u,v)}{\partial u^{n-i}\, \partial v^{i}}$, and let $q = (q_0, q_1, \ldots, q_n)^T$. Furthermore, let $(u_k, v_k)$, $k = 1, \ldots, K$, be $K$ arbitrary points of $\Omega$. Then, if $q$ verifies (14), one can write
$$\begin{bmatrix} h(u_1,v_1)^T \\ h(u_2,v_2)^T \\ \vdots \\ h(u_K,v_K)^T \end{bmatrix} q = Hq = 0 \qquad (17)$$
By choosing $K = n$ and by appropriately choosing the $K$ points, $q$ can be uniquely determined up to a scaling. In practice, since the partial derivatives are estimated from a finite sample, they may be far from their true values and equation (14) will be impossible to fulfill for all values of $(u,v)$. In order to increase stability, one may select a number of points $K \geq n$ and solve (17) in the least-squares sense under the constraint $q^T q = 1$. It is then well known that $q$ is the eigenvector of $\Re\{H^{\dagger} H\}$ corresponding to the smallest eigenvalue. To summarize, the estimation procedure steps are:
1. Select $K$ points $(u_k, v_k) \in \Omega$, $k = 1, \ldots, K$.

2. For each point $(u_k, v_k)$ estimate
$$\frac{\partial^n \log \hat{\Phi}_{x_1,x_2}}{\partial u^{n-i}\, \partial v^{i}}\Big|_{(u_k,v_k)}$$
and form the matrix $H$.

3. Compute the eigenvector $\hat{q}$ of $\Re\{H^{\dagger} H\}$ corresponding to the smallest eigenvalue.

4. Compute $(\hat{a}_i, \hat{b}_i)$ as the roots of the polynomial $F(x,y) = \sum_{j=0}^{n} \hat{q}_j\, x^{n-j} y^{j}$.

The selection of the $K$ sampling points is a delicate procedure; moreover, for an optimal sampling scheme one needs to derive the theoretical performance. However, in practice, several ad hoc schemes show nearly equal performance; these include rectangular, random and circular sampling. The computation of the roots of the polynomial, step (4), may lead to complex roots for small sample sizes. This pathological case can be overcome by choosing different sampling points and/or increasing $K$ and restarting the procedure. A better alternative is to directly parametrize the vector $q$ by its real roots and solve (17) by a nonlinear least-squares procedure, which however leads to a complex nonlinear multidimensional algorithm. A sketch of the complete procedure in code follows.
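The following minimal end-to-end sketch puts the four steps together, reusing the helpers d_log_ecf and mixture_from_q defined earlier; the circular sampling radius r, the number of points K and the random seed are assumptions of this example, not values from the paper:

```python
import numpy as np

def estimate_mixture(x1, x2, n, K=40, r=0.5, seed=0):
    """Steps 1-4 of the estimation procedure (a sketch)."""
    rng = np.random.default_rng(seed)
    # Step 1: K sampling points on a circle of radius r (one of the ad hoc
    # schemes mentioned above; r must keep the points inside Omega).
    theta = 2 * np.pi * rng.random(K)
    U, V = r * np.cos(theta), r * np.sin(theta)
    # Step 2: estimate the n-th order derivatives of log Phi_hat, form H.
    H = np.array([[d_log_ecf(x1, x2, u, v, n, i) for i in range(n + 1)]
                  for u, v in zip(U, V)])
    # Step 3: eigenvector of Re(H^dagger H) with the smallest eigenvalue.
    M = np.real(H.conj().T @ H)
    _, eigvecs = np.linalg.eigh(M)      # eigenvalues in ascending order
    q_hat = eigvecs[:, 0]
    # Step 4: roots of F(x, y) give the couples, up to scale and order.
    return mixture_from_q(q_hat)
```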
[Figure 1: Mean Gap. Mean gap versus sample size (100 to 1000 samples), for the noise-free case and for additive Gaussian noise of variance 0.1.]

5. SIMULATION RESULTS

Computer simulations of the previous estimation procedure have been carried out with 4 binary sources with support $\{-1, 1\}$, in both noisy and noise-free scenarios. As has already been pointed out by Comon [3], it is quite difficult to define a good performance measure, since the matrix of coefficients is restored up to a right multiplication by a diagonal matrix and a permutation matrix. Nevertheless, Comon defines a measure, the "gap", between two matrices: the minimal sum of the scaled distances between the normalized columns of the two matrices over the set of possible permutations. Figure 1 shows the result averaged over 100 experiments. Taking into account that the range of variation of the gap is $[0, 8]$, it is easily seen that the gap is quite small compared to its maximal value 8. The procedure also performs well with additive Gaussian noise, even if performance degrades compared to the noise-free case. This robustness is due to the high-order derivatives of the logarithm of the characteristic function, which 'eliminate' the quadratic terms.
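The gap is straightforward to approximate in code. A small sketch, assuming numpy; the exact scaling in Comon's definition may differ, so this is an illustrative variant that also resolves the sign indeterminacy of each column:

```python
import numpy as np
from itertools import permutations

def gap(A, B):
    """Distance between two 2 x n coefficient matrices, up to column
    permutation, scaling and sign (an illustrative variant of the gap)."""
    A = A / np.linalg.norm(A, axis=0)   # normalise columns
    B = B / np.linalg.norm(B, axis=0)
    n = A.shape[1]
    best = np.inf
    for perm in permutations(range(n)):
        # For each column pairing, take the better of the two signs.
        d = sum(min(np.sum(np.abs(A[:, i] - B[:, j]) ** 2),
                    np.sum(np.abs(A[:, i] + B[:, j]) ** 2))
                for i, j in enumerate(perm))
        best = min(best, d)
    return best
```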
6. CONCLUSION

When two random variables can be expressed as linear combinations of $n$ independent random variables, their second characteristic function obeys a linear PDE which is unique up to a scale factor. This key observation, combined with a uniqueness result on both the coefficients of the mixture and the number of sources, allows the design of an estimation procedure. It also has a potential application to the detection of the number of independent sources, which is highly relevant in many practical applications. Several refinements of the procedure can be performed, in particular the selection of an optimal sampling scheme as well as the implementation of robust and/or weighted least squares. Computer simulations illustrate the performance of the estimation procedure and its robustness with respect to additive Gaussian noise. It is clear that this procedure can easily be extended to deal with a number of sensors $p \geq 2$; this will be the subject of a forthcoming paper.

7. REFERENCES
[1] X. Cao and R. Liu. General approach to blind source separation. IEEE Trans. Signal Processing, 44(3):562-571, March 1996.
[2] J.-F. Cardoso. Super-symmetric decomposition of the fourth-order cumulant tensor. Blind identification of more sources than sensors. In Proc. ICASSP 91, pages 3109-3112, Toronto, Canada, May 1991.
[3] P. Comon. Blind channel identification and extraction of more sources than sensors. In SPIE Conference, pages 2-13, San Diego, July 19-24, 1998. Keynote address.
[4] P. Comon and O. Grellier. Nonlinear inversion of underdetermined mixtures. In Proc. ICA 99, Aussois, France, January 1999. Submitted.
[5] G. Darmois. Analyse générale des liaisons stochastiques. Rev. Inst. Internat. Stat., 21:2-8, 1953.
[6] L. De Lathauwer, P. Comon, and B. De Moor. ICA algorithms for 3 sources and 2 sensors. In Proc. Sixth Sig. Proc. Workshop on Higher Order Statistics, pages 116-120, Caesarea, Israel, June 14-16, 1999.
[7] L. De Lathauwer, B. De Moor, and J. Vandewalle. ICA techniques for more sources than sensors. In Proc. Sixth Sig. Proc. Workshop on Higher Order Statistics, pages 121-124, Caesarea, Israel, June 14-16, 1999.
[8] A. M. Kagan, Y. V. Linnik, and C. R. Rao. Characterization Problems in Mathematical Statistics. Wiley, 1973.
[9] A. Yeredor. Blind source separation using the second derivative of the second characteristic function. In Proc. ICASSP 00, pages 3136-3139, Istanbul, Turkey, May 2000.