
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 47, NO. 5, JULY 2001


Cumulant-Based Independence Measures for Linear Mixtures

Jean-Christophe Pesquet, Senior Member, IEEE, and Eric Moreau, Associate Member, IEEE

Abstract—This paper deals with independence measures for linear mixtures of mutually independent random variables. Such measures, also known as contrasts, constitute useful criteria in solving blind source separation problems. By making use of the Schur convexity properties, we show that it is possible to define a wide-ranging class of contrast functions based on the auto-cumulants of the components of the random vector being considered. Among the most appealing characteristics of these new contrast functions is that they can be used to combine cumulants of different orders in a flexible way. Furthermore, extensions of existing cross-cumulant-based contrasts are proposed. Finally, some particularization of our approach to measures of decorrelation is considered. A general characterization of these decorrelation measures using strictly Schur convex functions is provided.

Index Terms—Contrasts, convex functions, cumulants, decorrelation, higher order statistics, independence, majorization, source separation.

I. INTRODUCTION

In many signal processing problems, it is useful to know whether the components of a random vector are mutually independent. If they are not, it is often desired to make them independent when this can be achieved. This operation is sometimes called Independent Component Analysis (ICA) [20], [9]. To give a few examples, in the field of source coding, finding a low-complexity transform which will decorrelate the data to be compressed as much as possible is often the first step in the design of a coding method [19], [18]. Except in certain specific cases, however, decorrelation is not sufficient to guarantee independence. In the context of blind source separation [20], [21], [4], [39], [9], [11], [24], [27], [28], [15], [33], [16], [23], [12], [5], [13], [31], much effort has been devoted to recovering independent components mixed by an unknown linear transform. This problem has found numerous applications in various fields of engineering, e.g., array processing, data communications, speech processing, and seismic exploration. Another recent domain of interest is the search for a “best basis” adapted to the analysis of a given stochastic process [2], [8], [30], [34]. The problem is then to determine within a dictionary of bases the ones which lead to the most uncorrelated/independent components.

For all these tasks, measures of independence can be optimized. Essentially, these reach their maximum values when the components of interest become mutually independent. The concept of contrast was introduced by Comon [9] to formalize this idea. Traditionally, the measurement of independence is based on the Kullback–Leibler “distance” between the probability distribution of the considered random vector and the product of its marginal distributions [9], [34]. Although optimal from a number of points of view, the use of the Kullback–Leibler divergence may be computationally expensive as it requires (at least) the empirical estimation of the marginal distributions. One way of circumventing this difficulty is to assume a parametric form for these distributions [33]. In this paper, however, we will only be concerned with nonparametric approaches requiring very little prior statistical knowledge.

Criteria based on cumulants are popular for their simplicity, especially for source separation. These criteria may either be functions of auto-cumulants [9], [11], [27], [28], [31] or depend on cross-cumulants [21], [4], [15], [16], [5]. While other interesting criteria may be considered [9], [13], these can often be approximated by a cumulant-based independence measure after some polynomial expansion of the nonlinear functions involved in the expressions of these criteria is performed. It should be noted that similar criteria using high-order statistics have been employed for blind deconvolution of linear time-invariant systems, both in the monovariate (see, e.g., [43], [17], [35], [40], [7]) and multivariate cases (see, e.g., [37], [44], [41], [6], [10], [29]).

Manuscript received September 5, 1999; revised June 14, 2000. The material in this paper was presented in part at the IEEE International Conference on Acoustics, Speech, and Signal Processing, Munich, Germany, April 1997. J.-C. Pesquet is with the Laboratoire des Signaux et Systèmes, Université de Marne-la-Vallée, Cité Descartes, Champs sur Marne, F-77454 Marne la Vallée Cedex 2, France (e-mail: [email protected]). E. Moreau is with SIS ISITV, Université de Toulon et du Var, F-83162 La Valette du Var Cedex, France (e-mail: [email protected]). Communicated by U. Madhow, Associate Editor for Detection and Estimation. Publisher Item Identifier S 0018-9448(01)04427-3.
In this study, we will mostly restrict our attention to random vectors resulting from an orthonormal linear mixture of independent components. In source separation problems, the orthonormality assumption is valid for linear mixtures provided that a prewhitening of the observation vector is performed. In the best basis search problems, the assumption is valid if there exists an orthonormal basis (belonging to the considered dictionary) where the components of the analyzed process are independent. For natural signals, it is clear that this assumption is idealistic. Sometimes, however, synthetic signals can be produced which satisfy this property (for example, signals resulting from orthonormal transmultiplexing/scrambling of independent data).

Our objective in this paper is to provide a generalization of existing cumulant-based contrast functions. This generalization borrows some elements from the theory of majorization (see [25] and references therein) which was introduced by Hardy, Littlewood, and Pólya in the 1930s. Note that this theory was recently shown to be of interest in other types of signal processing problems [14]. The interest of our results is threefold. First, they establish some fundamental connections between cumulant-based contrasts and the class of Schur convex functions. Secondly, they bring together some of the existing results in a unifying framework and provide both simple and elegant proofs for them. Finally, they provide us with some flexibility in the choice of a contrast function. In particular, rather general rules are given for combining cumulants of different orders in such a criterion.

The remainder of the paper is organized as follows. In Section II, we list some relevant definitions along with the main notation and assumptions used within this study. Section III constitutes the central part of the paper, and shows how to obtain a contrast as a Schur convex function of a number of auto-cumulants. Special cases are further discussed to illustrate the generality of this result. Some complementary topics are addressed in Section IV. In Section IV-A, connections with contrasts based on cross-cumulants are investigated. For the sake of simplicity, we concentrate on multipolynomial functions of these cross-cumulants. In Section IV-B, results similar to those of Section III are derived for decorrelation measures (also called semicontrasts). A necessary and sufficient condition for a function of the component variances to be such a measure is given. Finally, conclusions are drawn in Section V.

0018–9448/01$10.00 © 2001 IEEE

II. PROBLEM STATEMENT

We consider a linear mixture of real random signals , , called sources.1 Our study is here restricted to the real case although our results may be easily extended to the complex one. At a given time , the linear input–output relation of the (purely spatial) mixing system is

(1)

where is the invertible matrix characterizing the mixture, the observation vector and the vector of sources which satisfies the following assumption:

. For all , the components of are statistically mutually independent and they satisfy . Furthermore, for all and where , the th-order cumulant of

is independent of and will be denoted by .

For more details about high-order cumulants and their properties, see references such as [26], [32], [3], [38]. More specifically, we will be interested in some subset of the set of random vectors satisfying the above assumption.2

In the above context, the general independence problem consists in determining a linear transform operating on the observation vector , such that its outputs , , are statistically independent. This transformation is given by

(2)

where is the output vector and is a matrix to be determined. Independence is then achieved if and only if (iff) the global transform matrix defined by

(3)

satisfies the so-called independence property [9]:

(4)

where is an invertible diagonal matrix and a permutation matrix. In this work, we focus on global mixtures such that is a doubly stochastic matrix, i.e.,

(5)

(6)

Let us denote by the corresponding set of matrices. Note that a diagonal matrix belongs to iff its entries are equal to . The set of matrices of the form (4) is denoted by . From a practical point of view, see e.g., [9], an important subset of is the set of orthonormal (unitary) matrices which will be denoted by . Note that it is relatively simple to constrain to lie in by choosing in , when also belongs to . Let ; then the set of random vectors built from (2), (1), and (3) where and is denoted by .

From now on, the explicit dependence of random vectors on discrete time will be omitted when no confusion is possible. Since we wish to obtain a statistically independent vector , a measure of independence is needed. Such measures have been introduced by Comon [9] and called contrast functions or simply contrasts.

Definition 1: A contrast is a multivariate mapping from the set to which satisfies the following three requirements:

1 The symbols , , , , and designate the set of integer numbers, positive integer numbers, real numbers, nonnegative real numbers, and positive real numbers, respectively.

2 Such a subset typically corresponds to sources such that some of their cumulants are nonzero.
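The mixing model, the prewhitening step mentioned in the Introduction, and the doubly stochastic condition can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the variable names (`a` for the sources, `g` for the mixing matrix, `x` for the observations, `w` for the whitening matrix, `s_glob` for the global matrix) and the helper functions are introduced here for illustration, uniform sources are an arbitrary choice, and the doubly stochastic matrix is assumed to be the matrix of squared entries of the global matrix, as the remark on orthonormal matrices suggests.

```python
import numpy as np

def random_orthonormal(n, rng):
    """Draw an n x n orthonormal matrix from the QR factorization of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))  # flip column signs; q stays orthonormal

def is_doubly_stochastic(p, tol=1e-8):
    """True if p has nonnegative entries and all row and column sums equal 1."""
    return bool((p >= -tol).all()
                and np.allclose(p.sum(axis=0), 1.0, atol=tol)
                and np.allclose(p.sum(axis=1), 1.0, atol=tol))

def prewhiten(x):
    """Center x (rows = components) and return (z, w) with z = w @ x_centered, cov(z) = I."""
    xc = x - x.mean(axis=1, keepdims=True)
    cov = xc @ xc.T / xc.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    w = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T  # inverse square root of cov
    return w @ xc, w

rng = np.random.default_rng(0)
n_src, n_samp = 3, 20000

# Hypothetical independent, zero-mean, unit-variance sources (uniform law).
a = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(n_src, n_samp))
g = rng.standard_normal((n_src, n_src))   # unknown mixing matrix (invertible with prob. 1)
x = g @ a                                  # observations

z, w = prewhiten(x)
s_glob = w @ g                             # global matrix after whitening

# After whitening, the global matrix is orthonormal up to estimation error.
print(np.round(s_glob @ s_glob.T, 2))

# Squared entries of any orthonormal matrix form a doubly stochastic matrix.
q = random_orthonormal(4, rng)
print(is_doubly_stochastic(q ** 2))
```

With a finite sample the global matrix is only approximately orthonormal, because the sample covariance of the sources is close to, but not exactly, the identity.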


According to the definition, a contrast is invariant under any sign change of its components which play symmetric roles and it must be maximized to get independence. In the following, contrast will be called “contrast on ” or “ -contrast” in order to specify the set of random vectors considered. If is a -contrast, a necessary and sufficient condition for statistical independence of a random vector built from a linear transform (in ) of a vector with independent components is . This is the reason why the resolution of blind source separation problems is often called Independent Component Analysis (ICA).

Let and be defined as the set of sources satisfying , with at most one of their cumulants of order , , equal to zero. A first -contrast was proposed in [9]. Its expression is given by


(7)

where the superscript will be clarified later. When considering orthonormal mixtures (i.e., ), the above contrast has an interesting interpretation. Indeed, its maximization is equivalent to the minimization of the sum of the squares of all cross-cumulants of the same order . These cumulants are actually measures of statistical dependence (at order ). This point will be further discussed in Section IV-A. Later [27], [28], it was shown that squaring the cumulants in contrast (7) is not necessary. The following -contrast was then proposed:

(8)

It is interesting to note that there exists a common point between the two contrasts. They are indeed the sums of certain convex functions, or , of the absolute values of th-order cumulants. In the next section, a generalization of the above measures of independence is proposed. It is based on a convexity property. In particular, this will allow us to bring together contrasts and , as well as some new contrast functions, into a unifying framework.

The developments in the next section rely on multivariate majorization results. We now recall some important definitions which will be used subsequently. Further insight into the theory of majorization can be found in [25].

Let and be a subset of . A function

(9)

is symmetric if, for all in and for any permutation matrices such that , we have

Let and be two vectors , . We will write when . The function of is increasing if, for all and with

(10)

we have

An increasing function is strictly increasing if, when (10) is satisfied, we have

iff

Let and . The set of vectors is said to be majorized by if there exists a doubly stochastic matrix such that

The function as defined in (9) is Schur-convex if, for all and such that is majorized by , we have

To be strictly Schur-convex, a function must be Schur-convex and, for all such that is majorized by , we must have

iff there exists a permutation matrix such that

It must be emphasized that the class of Schur-convex (resp., strictly Schur-convex) functions includes functions which, in each of their arguments in , are symmetric and convex (resp., strictly convex). Another class of Schur-convex (resp., strictly Schur-convex) functions contains symmetric convex (resp., strictly convex) functions on . Moreover, Schur-convex functions are themselves necessarily symmetric.

III. A GENERALIZED FORM OF CONTRAST FUNCTIONS

A. Main Result

Let , , and , where is a subset of sources satisfying with . As previously mentioned, contrasts and with , are sums of convex functions of absolute values of cumulants. This suggests the following definition:

(11)

where and is Schur-convex. The following lemma is the first step toward proving that, under appropriate assumptions, is a contrast. Its proof is given in full in Appendix I.

Lemma 1: If is an increasing Schur-convex function, this implies that for any vector the inequality

(12)

holds.
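The majorization argument behind Lemma 1 can be checked numerically in its simplest setting. The sketch below is an illustration only and makes several assumptions: it uses ordinary (univariate) majorization between two vectors rather than the multivariate version used in the paper, the doubly stochastic matrix is built as a convex combination of permutation matrices, and all function names are hypothetical. It shows that when `y = p @ x` with `p` doubly stochastic, a symmetric sum of convex functions (one standard family of Schur-convex maps) cannot increase.

```python
import numpy as np

def random_doubly_stochastic(n, rng, n_perm=6):
    """Convex combination of random permutation matrices (Birkhoff construction)."""
    weights = rng.dirichlet(np.ones(n_perm))
    p = np.zeros((n, n))
    for wgt in weights:
        p += wgt * np.eye(n)[rng.permutation(n)]
    return p

def is_majorized(y, x, tol=1e-9):
    """True if y is majorized by x: ordered partial sums of y never exceed those of x."""
    ys, xs = np.sort(y)[::-1], np.sort(x)[::-1]
    partial_ok = bool(np.all(np.cumsum(ys) <= np.cumsum(xs) + tol))
    return partial_ok and abs(ys.sum() - xs.sum()) <= tol

def schur_sum(v, psi):
    """Symmetric sum of a convex scalar function: a Schur-convex map."""
    return float(np.sum(psi(v)))

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 5.0, size=5)        # stands in for absolute source cumulants
p = random_doubly_stochastic(5, rng)
y = p @ x                                 # y is majorized by x

convex_maps = {"square": np.square, "cube": lambda u: u ** 3, "exp": np.exp}
gaps = {name: schur_sum(x, psi) - schur_sum(y, psi)
        for name, psi in convex_maps.items()}

print(is_majorized(y, x))
print(gaps)   # each gap is nonnegative
```

The convex maps are also increasing on the nonnegative half-line, which is the combination of properties (increasing and Schur-convex) that the lemma relies on.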



This preliminary result means that, subject to the assumptions made on , the function satisfies Requirement of a contrast function. In order to establish , slightly stronger assumptions are necessary. In particular, is subsequently defined by and the following supplementary condition:

. there exists at least values of such that

(13)

which will allow us to state the main result of this section.

Proposition 1: If is a strictly increasing Schur-convex function, the function is a -contrast.

Proof: Requirement is obviously fulfilled since any Schur-convex function is symmetric. Requirement follows from Lemma 1. Furthermore, as is strictly increasing, if , equality holds in (12) and hence in (34), then

Since

we deduce that, for all satisfying (13), we obtain

(14)

As , this leads to or . By using the fact that has been assumed doubly stochastic and relation (14) is satisfied for at least values of the column index , we can easily see that is a permutation matrix. This allows us to conclude that the separation property (4) is satisfied.

Notes:

a) Condition is obviously violated when more than one source is Gaussian. This condition also prohibits us from only using odd-order cumulants for more than one source with symmetric probability distributions.

b) The above result allows us to combine cumulants of different orders to build contrasts. This leads to an improved robustness with respect to (w.r.t.) variations of the statistics of the sources.3 In particular, second order moments can be used in combination with higher order statistics.

c) Proposition 1 provides a sufficient but not necessary condition for a function to be a contrast. Examples of other contrasts will be given in Section IV.

3 It is relatively easy to consider parametric sets of sources whose cumulants of a given order vanish for some values of the parameters [44].

B. Some Examples

Contrast functions can be built by decomposing the function in (11) into the sum of simpler functions. If one considers sums of individual functions of different order cumulants, we have the following.

Corollary 1: For all , let be a Schur-convex function which is strictly increasing. The function defined by

(15)

is a -contrast.

Proof: One easily checks that as defined in (15) is a strictly increasing Schur-convex function.

Another possibility is to decompose as a sum of functions of each component of the observation vector. In the remainder of this paper, we will be mainly interested in the following contrast functions.

Proposition 2: Let be a strictly increasing convex function. The function defined by

(16)

is a -contrast.

Proof: The function as defined in (16) is clearly symmetric, strictly increasing, and convex. As already mentioned at the end of Section II, a symmetric convex function is Schur-convex. Thus, the result follows from Proposition 1.

Simple contrast functions can be built by choosing a monovariate function in (16). A possible choice is

(17)

with , , and . We note that when , the corresponding -contrast reduces to the contrast in (7) and, when , (8) is obtained.

Multivariate functions can also be considered in order to make use of cumulants of different orders. The following result may be useful in order to design such functions.

Proposition 3: Given , let be a matrix with nonnegative elements such that

(18)

A function satisfying the assumptions of Proposition 2 is then

(19)

where , , and

(20)

The proof of this proposition is based on elementary properties of increasing/convex functions. For completeness, it is provided in Appendix II.

By noting that is Schur-convex iff is Schur-convex, contrast functions may also be obtained using multiplicative decompositions. Thus, under the additional assumption that, for all , is a positive function, Corollary 1 remains valid when the summation in (15) is replaced by a product. Similarly, Proposition 2 allows us to deduce the following.

Corollary 2: Let be a strictly increasing function such that is convex. The function defined by

(21)

is a -contrast.

IV. OTHER EXTENSIONS OF CONTRAST
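As an illustration of the power-type contrasts of Section III-B, the following sketch evaluates a sum of powers of absolute fourth-order auto-cumulants over a family of rotations of two independent sources. It is a hedged example rather than the authors' procedure: the function names, the choice of order four, the uniform sources, and the exponent values are assumptions made here. The exponent 2 corresponds to squaring the cumulants as described for contrast (7), and the exponent 1 to using absolute values without squaring as described for (8).

```python
import numpy as np

def kurt(v):
    """Sample fourth-order auto-cumulant of each row (rows are centered first)."""
    v = v - v.mean(axis=1, keepdims=True)
    m2 = np.mean(v ** 2, axis=1)
    m4 = np.mean(v ** 4, axis=1)
    return m4 - 3.0 * m2 ** 2

def power_contrast(y, alpha):
    """Sum over components of |C_4[y_i]| ** alpha, with alpha >= 1."""
    return float(np.sum(np.abs(kurt(y)) ** alpha))

def rotation(theta):
    """2 x 2 orthonormal mixing matrix parameterized by an angle."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

rng = np.random.default_rng(0)
# Hypothetical independent, unit-variance, non-Gaussian sources.
a = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(2, 50000))

angles = np.linspace(0.0, np.pi / 2, 91)   # one-degree grid; index 45 is pi/4
values = {}
best_angle = {}
for alpha in (1.0, 2.0):
    values[alpha] = [power_contrast(rotation(t) @ a, alpha) for t in angles]
    best_angle[alpha] = angles[int(np.argmax(values[alpha]))]

for alpha in (1.0, 2.0):
    print(alpha, np.degrees(best_angle[alpha]))
```

On this grid the largest values are expected near 0 or 90 degrees, where the rotation reduces to a signed permutation and the outputs coincide with the sources up to order and sign; the smallest values occur around 45 degrees, where the mixture is strongest.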


. , found in the expressions of both , Then, we have

, and index and

may be .

(23)

The rationale of this lemma is the following. Each term of the summation in (22) is the product of cumulant functions, each of order . Condition is introduced to guarantee that any index , , appears twice in each of these terms. Furthermore, it ensures that at least one common index may be found in the expression of the th and th cumulant functions, for . Note also that, according to (23), is invariant with respect to any permutation of the indexes when orthogonal mixtures of independent components are considered. We can now prove the following result, using the same notation as in Lemma 2.

Proposition 4: Let

such that4

A. Connections with Criteria Based on Cross-Cumulants

Because of the properties of cumulants, it may appear more natural to base the construction of measures of independence on cross-cumulants rather than (auto-)cumulants. As will be shown in this section, however, this would lead to less tractable criteria. Furthermore, it is sometimes possible to find an equivalence between the two kinds of criteria. In this section, we restrict our attention to the case of orthonormal matrices . As in the previous section, denote the orders of the considered cumulants, where , and the sources in satisfy with . We first need to state the following lemma whose proof is reproduced in Appendix III:

Let .

be defined by and there exists at most one

Define the function

by

where and order , which satisfies

and

such that

and assume that

Lemma 2: Let

Define the function

by

is a cross-cumulant function of

(22) where of cumulant of order

,

is an indexing vector , is a

and, for all of the form

where, for all

and

(24) is a -contrast. Then, Proof: According to Lemma 2, have

and

, we

,

(25)

is such that

Let be the number of times index the summation. If

appears in each term of

Since the numbers , , are even and at least one of them is greater than or equal to , it is shown in Appendix IV that is a -contrast.

4 The set of even positive integer numbers is denoted by 2 .
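Section II states that, for orthonormal mixtures, maximizing the sum of squared auto-cumulants of a given order is equivalent to minimizing the sum of the squares of all cross-cumulants of that order. The sketch below illustrates the underlying invariance at order four. It is an illustration under assumptions made here (order four, three uniform sources, hypothetical function names): the total squared norm of the sample cumulant tensor is unchanged by an orthonormal transform, so any energy lost by the auto-cumulants reappears in the cross-cumulants.

```python
import numpy as np

def cumulant4_tensor(y):
    """Sample fourth-order cumulant tensor of the rows of y (rows are centered first)."""
    y = y - y.mean(axis=1, keepdims=True)
    t = y.shape[1]
    m2 = y @ y.T / t
    m4 = np.einsum('it,jt,kt,lt->ijkl', y, y, y, y) / t
    return (m4
            - np.einsum('ij,kl->ijkl', m2, m2)
            - np.einsum('ik,jl->ijkl', m2, m2)
            - np.einsum('il,jk->ijkl', m2, m2))

def split_energy(c):
    """Return (sum of squared auto-cumulants, sum of squared cross-cumulants)."""
    n = c.shape[0]
    auto = float(sum(c[i, i, i, i] ** 2 for i in range(n)))
    total = float(np.sum(c ** 2))
    return auto, total - auto

rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 1.0, size=(3, 20000))        # hypothetical independent sources
q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # orthonormal mixing matrix

auto_a, cross_a = split_energy(cumulant4_tensor(a))
auto_y, cross_y = split_energy(cumulant4_tensor(q @ a))

print(auto_a + cross_a, auto_y + cross_y)  # totals agree up to rounding
print(cross_a, cross_y)                    # cross energy grows after mixing
```

The invariance holds for the sample tensor as well as for the true one, because the sample cumulants computed from the same data are multilinear in the transform.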



As proposed in [15], [16], a first example of a contrast within this class ( , , ) is

(26)

A second example [4] ( , , ) is

(27)

Hence, Proposition 4 gives a generalization of these two results. Another contrast is

(28)

As a result of (25), contrasts (27) and (28) take exactly the same values for orthogonal mixtures of independent components.

A second class of contrasts based on cross-cumulants may be derived from Lemma 2 by setting all the variables equal to .

Proposition 5: Let , , and . Let be defined by and there exists at most one such that . Let be the -contrast which is given by

(29)

Then, a -contrast based on cross-cumulants is equivalently5

where and satisfying is defined as in Lemma 2 with

(30)

, and .

Proof: The fact that is a -contrast follows from Proposition 2. If , we deduce from (23) and (30) that

(31)

where with , , and

As is a -contrast, is negative or zero and only vanishes when the independence property holds. In conjunction with which can be easily verified, this yields the desired property.

It must be noted that the above proposition is an extension of a result in [9], where the equivalence between , and was established.

Many other contrasts may be constructed by using Proposition 5. For instance, is also equivalent to

where and

B. Measures of Decorrelation

Propositions 1, 2, 4, and 5 prescribe the use of high-order statistics to build contrast functions. Second-order moments may, however, be sufficient in certain specific situations. To formalize the concept of measure of decorrelation, we introduce some assumptions and definitions. The source vector is now assumed to satisfy

. The sources , , are with finite variances, (wide-sense) stationary and statistically mutually uncorrelated.

Note that is clearly more restrictive than . Let denote the set of random vectors of sources satisfying the above assumption. The set then corresponds to the set of second-order real random vectors since any such vector can be considered as an orthogonal mixture of uncorrelated sources (possibly with different variances) when it is decomposed in a Karhunen–Loève basis.6 On the other hand, it is not always possible to express a vector as an orthogonal mixture of independent sources. The measures of decorrelation introduced in this section will be called semicontrast functions or simply semicontrasts as one may consider that “half” of the task of source separation has been performed when decorrelation is realized (cf. [9], [4]).

Definition 2: A semicontrast is a multivariate mapping from the set to which satisfies requirement and

5 Two $Y(U; A)$-contrast functions $I_1(\cdot)$ and $I_2(\cdot)$ are said to be equivalent if there exists a function $g(\cdot)$ defined on $A$ such that $\forall S \in U$ and $\forall a \in A$, $I_1(Sa) = I_2(Sa) + g(a)$.
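The role of the Karhunen–Loève basis in decorrelation measures can be checked on a small example. This is a sketch under assumptions introduced here: the covariance matrix is a random hypothetical one, the measure is taken as a sum of a strictly convex function of the component variances, and the function names are illustrative. It relies on the classical fact that the diagonal of a symmetric matrix is majorized by its eigenvalues, so such a sum is largest when the components are decorrelated.

```python
import numpy as np

def variance_measure(variances, phi):
    """Sum of a scalar function phi over the component variances."""
    return float(np.sum(phi(variances)))

def coding_gain(variances):
    """Ratio of the arithmetic mean to the geometric mean of the variances."""
    return float(np.mean(variances) / np.exp(np.mean(np.log(variances))))

rng = np.random.default_rng(0)
b = rng.standard_normal((4, 4))
cov = b @ b.T                          # hypothetical covariance of a correlated vector

eigval, eigvec = np.linalg.eigh(cov)
klt_var = eigval                       # variances in the Karhunen-Loeve (PCA) basis

q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
other_var = np.diag(q @ cov @ q.T)     # variances after another orthonormal transform

strictly_convex = {"square": np.square, "neg_log": lambda v: -np.log(v)}
for name, phi in strictly_convex.items():
    print(name, variance_measure(other_var, phi), variance_measure(klt_var, phi))

print(coding_gain(other_var), coding_gain(klt_var))
```

The negative logarithm case is the one connected with the geometric mean of the variances: since the trace is preserved by orthonormal transforms, maximizing it amounts to maximizing the coding gain, which the decorrelating transform achieves.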

A characterization of semicontrast functions is provided by the following proposition whose proof is given in Appendix V.

6 This amounts to performing a PCA.


Proposition 6: Let where is a function from to and is the vector of component variances of . A necessary and sufficient condition for to be a semicontrast is that is a strictly Schur-convex function.

A semicontrast of the above form is obtained by setting

where is a strictly convex function from to . This latter result was already obtained in [42]. Similarly to Corollary 2, another example of a semicontrast is given by

where is a function from to such that is strictly convex. As an example

corresponds to a semicontrast. When has nonzero components, we can let and the inverse of the geometric mean of the component variances is obtained. Such a decorrelation measure is often used for lossy compression as it allows the definition of the coding gain [19], [18] in transform coding.

In general, decorrelation is a necessary but not sufficient condition for independence. However, a semicontrast is a contrast on when is the set of random vectors satisfying with , and whose components have different variances. It must be pointed out that this context is rather restrictive. In many practical situations, some of the sources do have the same variance and can even have the same probability distribution.

A further property can also be stated which is a direct consequence of Definitions 1 and 2.

Proposition 7: Let be a set of source vectors satisfying . If is a -contrast and is a semicontrast, then is a -contrast.

V. CONCLUSION

In this paper, we have defined a wide class of independence measures, which includes many existing contrasts as well as new ones. Incidentally, we have also proposed a general form for decorrelation measures based on the variances of the components of the considered random vectors.

In source separation, contrasts allow us to derive estimators for the unknown transformation matrix and the sources to be recovered. It is particularly interesting to study the statistical performances of the estimators corresponding to these new criteria (or a subclass of them) when empirical estimations of the cumulants must be made. The robustness of the source estimators must also be evaluated in order to better quantify the benefits which can be drawn from our generalizations. These statistical problems can be addressed in future work.

A number of further problems of interest could be investigated. The majorization techniques which have allowed us to prove the main results in this paper rely on conditions (5) and (6). It would be interesting to see how these assumptions can be relaxed. Finally, it would be useful to consider extensions of our results to the multivariate convolutive case [37], [44], [41], [1], [6]. As shown by some preliminary results in [10], [29], [36], this should not create major difficulties.

APPENDIX I
PROOF OF LEMMA 1

Equations (5) and (6) imply that, for all , . As a consequence, the properties of cumulants allow us to show that

(32)

For all , is the th component of vector defined as

(33)

As is increasing, inequality (32) yields

(34)

Furthermore, the matrix in (33) being doubly stochastic, is majorized by . By using the assumption of Schur-convexity of , we obtain the following inequality:

(35)

By combining (34) and (35), the desired result is obtained.

APPENDIX II
PROOF OF PROPOSITION 3

We first state the following lemma whose proof is simple and does not need to be reproduced here.

Lemma 3: Let . For all , let and , , be increasing convex functions from to and be matrices with nonnegative elements such that, for all , . The function

where is also an increasing convex function.

Proposition 3 appears to be a direct consequence of this lemma. Indeed, the properties required for function are obtained by setting

(36)


when , and

(37)

(38)

(39)

(40)

(41)

. The convexity of in (39) is deduced from the triangular inequality for norm when . Note that (20) is necessary to guarantee that has convex components in (40). Furthermore, function can be shown to be strictly increasing by calculating its gradients w.r.t. each of its variables.

APPENDIX III
PROOF OF LEMMA 2

Without loss of generality, Condition allows us to assume that

According to properties of cumulants, one can express as

(42)

(43)

By using (42), we find that

Without any loss of generality, we can re-express as

where, by convention, the product over reduces to if . This expression can be rewritten as

where, , is independent of (since , ). Summing over and making use of the orthogonality of lead to

By substituting the resulting term in (43), we conclude that (23) is satisfied.

APPENDIX IV
PROOF THAT IN (25) IS A CONTRAST

Function in (25) takes the following form:

By noting that , we have

(44)

This shows that Requirement is satisfied. Now equality holds in (44) only if

(45)

(46)

. (47)

According to (24), there exists , say , such that . Then, it is clear that only if column has only one nonzero component that is . Thus, equalities (47) are achieved if each of the first columns of has only one nonzero component equal to . Since is an orthogonal matrix, its row (resp., column) vectors must be further orthogonal and we can conclude that . Thus, is also satisfied and is a contrast.

APPENDIX V PROOF OF PROPOSITION 6 Let

be a vector such that where . As has uncorrelated components, we have

Since is a doubly stochastic matrix, . By using the Schur-convexity of

and

is majorized by , we conclude that (48)

is a Schur-convex function, We have thus proved that, if is fulfilled. Requirement By now, using the assumption of strict Schur-convexity of , we see that equality arises in (48) iff there exists a permutation matrix such that (49) Furthermore, as

, we have

which combined with (49) leads to

This clearly shows that is uncorrelated. In conclusion, the implies that Requirement is strict Schur-convexity of is also obviously satisfied, is a measure satisfied. As of decorrelation. We will now prove the reciprocal statement. For all such that is majorized by , it can be proved [25] that such that there exists an orthostochastic matrix This means that there is an orthonormal matrix such that . Furthermore, one can always find a random vector such that . Let (50) . If is a semicontrast, we we then obtain that . This proves that is deduce from , we deduce from Schur-convex. If, moreover, that . As is related to by (50), is equal to , up to some possible permutation. Consequently, a necto be a semicontrast is that is essary condition for strictly Schur-convex. REFERENCES [1] K. Abed-Merain, P. Loubaton, and E. Moulines, “A subspace algorithm for certain blind identification problems,” IEEE Trans. Inform. Theory, vol. 43, pp. 499–511, Mar. 1997. [2] J. B. Buckeit and D. L. Donoho, “Time-frequency tilings which best expose the non-Gaussian behavior of a stochastic process,” in Proc. IEEE Symp. Time-Frequency and Time-Scale Analysis (TFTS’96), Paris, France, June 18–21, 1996, pp. 1–4. [3] D. R. Brillinger, “Some basic aspects and uses of higher-order spectra,” Signal Processing, vol. 36, pp. 239–249, 1994.

1955

[4] J. F. Cardoso and A. Souloumiac, “Blind beamforming for non Gaussian signals,” Proc. Inst. Elec. Eng., Pt. F, vol. 140, pp. 362–370, 1993.
[5] J. F. Cardoso, “High-order contrasts for independent component analysis,” Neural Comput., vol. 11, pp. 157–192, 1999.
[6] L. Castedo, C. J. Escudero, and A. Dapena, “A blind signal separation method for multiuser communications,” IEEE Trans. Signal Processing, vol. 45, pp. 1343–1348, May 1997.
[7] C.-H. Chen, C.-Y. Chi, and W.-T. Chen, “New cumulant-based inverse filter criteria for deconvolution of nonminimum phase systems,” IEEE Trans. Signal Processing, vol. 44, pp. 1292–1297, 1996.
[8] R. R. Coifman and N. Saito, “The local Karhunen–Loève bases,” in Proc. IEEE Symp. Time-Frequency and Time-Scale Analysis (TFTS’96), Paris, France, June 18–21, 1996, pp. 129–132.
[9] P. Comon, “Independent component analysis, a new concept?,” Signal Processing, vol. 36, pp. 287–314, 1994.
[10] P. Comon, “Contrasts for multichannel blind deconvolution,” IEEE Signal Processing Lett., vol. 3, pp. 209–211, July 1996.
[11] N. Delfosse and P. Loubaton, “Adaptive blind separation of independent sources: A deflation approach,” Signal Processing, vol. 45, pp. 59–83, 1995.
[12] F. Gamboa and E. Gassiat, “Source separation when the input sources are discrete or have constant modulus,” IEEE Trans. Signal Processing, vol. 45, pp. 3062–3072, Dec. 1997.
[13] A. Hyvärinen, “Gaussian moments for noisy independent component analysis,” IEEE Signal Processing Lett., vol. 6, pp. 145–147, June 1999.
[14] H. Krim, “On the distribution of optimized multiscale representations,” in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP’97), vol. 5, Munich, Germany, Apr. 1997, pp. 3473–3476.
[15] L. De Lathauwer, B. De Moor, and J. Vandewalle, “Blind source separation by simultaneous third-order tensor diagonalization,” in Proc. European Signal Processing Conf. (EUSIPCO’96), Trieste, Italy, Sept. 1996, pp. 2089–2092.
[16] L. De Lathauwer, “Signal processing based on multilinear algebra,” Ph.D. dissertation, K. U. Leuven, Leuven, Belgium, Sept. 1997.
[17] D. Donoho, “On minimum entropy deconvolution,” in Applied Time Series Analysis II, D. Findley, Ed. New York: Academic, 1981, pp. 565–608.
[18] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Boston, MA: Kluwer, 1991.
[19] N. S. Jayant and P. Noll, Digital Coding of Waveforms. Englewood Cliffs, NJ: Prentice-Hall, 1984.
[20] C. Jutten and J. Herault, “Blind separation of sources, Part I: An adaptive algorithm based on neuromimetic architecture,” Signal Processing, vol. 24, pp. 1–10, 1991.
[21] J. L. Lacoume and P. Ruiz, “Separation of independent sources from correlated inputs,” IEEE Trans. Signal Processing, vol. 40, no. 12, pp. 3074–3078, Dec. 1992.
[22] Y. (G.) Li and K. J. R. Liu, “Adaptive blind source separation and equalization for multiple-input–multiple-output systems,” IEEE Trans. Inform. Theory, vol. 44, pp. 2864–2876, Nov. 1998.
[23] O. Macchi and E. Moreau, “Self-adaptive source separation, Part I: Convergence analysis of a direct linear network controlled by the Hérault–Jutten algorithm,” IEEE Trans. Signal Processing, vol. 45, pp. 918–926, Apr. 1997.
[24] A. Mansour and C. Jutten, “Fourth-order criteria for blind source separation,” IEEE Trans. Signal Processing, vol. 43, pp. 2022–2025, Aug. 1995.
[25] A. W. Marshall and I. Olkin, “Inequalities: Theory of majorization and its applications,” in Mathematics in Science and Engineering. San Diego, CA: Academic, 1979.
[26] J. Mendel, “Tutorial on higher-order statistics (spectra) in signal processing and system theory: Theoretical results and some applications,” Proc. IEEE, vol. 79, pp. 278–305, Mar. 1991.
[27] E. Moreau and O. Macchi, “High order contrasts for self-adaptive source separation,” Int. J. Adaptive Contr. Signal Processing, vol. 10, no. 1, pp. 19–46, Jan. 1996.
[28] E. Moreau, “Criteria for complex sources separation,” in Proc. European Signal Processing Conf. (EUSIPCO’96), vol. II, Trieste, Italy, Sept. 1996, pp. 931–934.
[29] E. Moreau and J.-C. Pesquet, “Generalized contrasts for multichannel blind deconvolution of linear systems,” IEEE Signal Processing Lett., vol. 4, pp. 182–183, June 1997.
[30] E. Moreau and J.-C. Pesquet, “Independence/decorrelation measures with applications to optimized orthonormal representations,” in Proc. Int. Conf. Acoustics, Speech and Signal Processing (ICASSP’97), vol. 5, Munich, Germany, Apr. 1997, pp. 3453–3456.
[31] E. Moreau and N. Thirion-Moreau, “Nonsymmetrical contrasts for source separation,” IEEE Trans. Signal Processing, vol. 47, pp. 2241–2252, Aug. 1999.
[32] C. L. Nikias and A. P. Petropulu, Higher-Order Spectra Analysis. A Nonlinear Signal Processing Framework, ser. Oppenheim Series in Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[33] D. T. Pham, “Blind separation of instantaneous mixture of sources via an independent component analysis,” IEEE Trans. Signal Processing, vol. 44, pp. 2768–2779, Nov. 1996.
[34] N. Saito, “The least statistically-dependent basis and its applications,” in Proc. 32nd Asilomar Conf. Signals, Systems and Computers, Pacific Grove, CA, Nov. 1–4, 1998, pp. 732–736.
[35] O. Shalvi and E. Weinstein, “New criteria for blind deconvolution of nonminimum phase systems (channels),” IEEE Trans. Inform. Theory, vol. 36, pp. 312–321, Mar. 1990.
[36] C. Simon, C. Vignat, P. Loubaton, C. Jutten, and G. d’Urso, “Separation of a class of convolutive mixtures: A contrast function approach,” IEEE Signal Processing Lett., submitted for publication.
[37] A. Swami, G. Giannakis, and S. Shamsunder, “Multichannel ARMA processes,” IEEE Trans. Signal Processing, vol. 42, pp. 898–913, Apr. 1994.
[38] A. Swami, G. Giannakis, and G. Zhou, “Bibliography on higher-order statistics,” Signal Processing, vol. 60, pp. 65–126, 1997.
[39] L. Tong, Y. Inouye, and R. Liu, “Waveform-preserving blind estimation of multiple independent sources,” IEEE Trans. Signal Processing, vol. 41, pp. 2461–2470, July 1993.
[40] J. K. Tugnait, “Estimation of linear parametric models using inverse filter criteria and higher order statistics,” IEEE Trans. Signal Processing, vol. 41, pp. 3196–3199, Nov. 1993.
[41] J. K. Tugnait, “Identification and deconvolution of multichannel linear non-Gaussian processes using higher order statistics and inverse filter criteria,” IEEE Trans. Signal Processing, vol. 45, pp. 658–672, Mar. 1997.
[42] S. Watanabe, “Karhunen–Loève expansion and factor analysis. Theoretical remarks and applications,” in Trans. 4th Prague Conf. Information Theory, Statistical Decision Functions, Random Processes. Prague, Czechoslovakia: Pub. House Czechoslovak Acad. Sci., 1965, pp. 635–660.
[43] R. A. Wiggins, “Minimum entropy deconvolution,” Geoexploration, vol. 16, pp. 21–35, 1978.
[44] D. Yellin and E. Weinstein, “Multichannel signal separation: Methods and analysis,” IEEE Trans. Signal Processing, vol. 44, pp. 106–118, Jan. 1996.