The 22nd Iranian Conference on Electrical Engineering (ICEE 2014), May 20-22, 2014, Shahid Beheshti University
Single Channel Source Separation Using NonGaussian NMF and Modified Hilbert Spectrum Seyyed Reza Sharafinezhad Department of ECE University of Tehran Tehran, Iran
[email protected]
Habib Alizadeh Department of ECE Tarbiat modares University Tehran, Iran
[email protected]
AbstractβIn this paper, a new and powerful method for Blind source separation for single channel mixtures in the noisy environment is presented. This method is based on nonnegative matrix factorization in which modified Hilbert spectrum is employed. In the proposed algorithm, a modified EEMD is offered to transfer the signal to the special intrinsic mode functions (IMF). We used the local spectrums (LoMS) as artificial observations. In order to make estimated spectrum using NMF, the maximization of non-Gaussianity is used. The simulation result indicates that the proposed method improves the quality of reconstruction better than traditional NMF and decreases the time consumption time, compare to more than NMF2D. Keywords-Blind Source Separation, factorization, Hilbert spectrum
I.
nonnegative
matrix
INTRODUCTION
Single-Channel Source Separation (SCSS) is one of the important aspects in the signal processing area which has widespread applications in audio processing, wireless communications, biomedical signal processing, radar signal processing, and image processing and so on [1]. However, SCSS is one of the challenging issues in the signal processing field because of finding two or some source signals from only one mixture. Different methods are available to separating the signals from a single mixture. In general, SCSS can be divided into two main groups of non-blind and blind methods. The method which is based on additional information and priori distributions of sources are called non-blind single channel source separation. However, another group of methods is not based on additional information of sources; it is only relied on partial assumptions; this method is so-called to blind single channel source separation. The conventional methods of SCSS are singular spectrum analysis [2], Hilbert spectrum based methods [3] and Nonnegative Matrix Factorization (NMF) based methods [4]. These methods are used in the time, frequency or time frequency domain. In this method, basic assumptions are considered to mixture of sources. In the SSA based methods, time series of the single channel signal is converted to the trajectory matrix. Then using the Eigen value decomposition method, this matrix is divided into some sub-group matrixes. In the last stage, this
978-1-4799-4409-5/14/$31.00 Β©2014 IEEE
1673
Mohammad Eshghi Department of ECE Shahid Beheshti University Tehran, Iran
[email protected]
matrixes is returned into the vectors with special assumption. Problems of this method, existence of parameters that are selected regulatory and there are restrictions on the type of source signals, such as the stationary and independence of sources [2]. Hilbert spectrum directly [3] or indirectly [5] is used to separate sources. This method is based on the use of oscillation mode function. Empirical mode decomposition disintegrates the signal into oscillation modes called instantaneous mode function (IMF). Each IMF satisfies two conditions, the number of zero crossing points must be equal to the number of extrema point and the mean value of upper envelop and lower envelop are close to zero at any point. Using transfer of IMFs into the time - frequency domain, the Hilbert spectrum is obtained. However, EMD method is sensitive to noise. This method requires signals that have the oscillation mode, and the amplitude of signals should not change at any moment strongly. Also, this method has mode mixing problem [3]. To solve these problems, the method based on adding several Gaussian noises to the signal and decomposes the signal to the oscillating mode and the averaging of IMF are presented. This method called ensemble empirical mode decomposition (EEMD) that is improved version of the empirical mode decomposition (EMD) method [6]. However, to calculate the number of added noise and its amplitude are not introduced any solution. Time consuming of EEMD is much longer than the EMD. Among non-blind and blind methods, NMF algorithm is the simplest method to separate two or more sources that Paatero introduces it in 1994 [7]. The most important issue in relation to NMF is the calculation of non-negative matrices π and π». Unknown matrices π and π» must be calculated so that the cost function is minimized. How to minimize the cost function is the main discussion in scientific Literature and is still one of the open issues in this field. However, the total effort that can be done in this field has been classified as follows: A group of algorithms try to change or improve the cost function. After introducing the standard NMF algorithm by Lee and Seung [8], many scientific papers are published in the analysis area, development and applications of NMF algorithms in various fields of science, engineering and
biomedicine [9], [10]. This group of NMF algorithms is presented by several authors with different mathematical formulas in the time, frequency and time - frequency domain. Another group of algorithms are based to search of appropriate methods to initialize matrixes π and π» to speed up to convergence of cost function [11]. However, any of these methods is not presented π and π» that be uniqueness and include same latent sources. In order to find a solution to the mentioned problems in the deferent SCBSS methods, we proposed the algorithm that is based on a modified Hilbert spectrum and non-Gaussian NMF. The key component of the proposed algorithm is to decompose the non-stationary signal into the total of stationary signals in timeβfrequency domain by modified Hilbert spectrum and maximize the non-Gaussianity of the obtained spectrum in NMF method. The rest of paper is organized as follows: In section II, a brief background on Hilbert spectrum and NMF methods are presented. In section III, the main idea of our BSS algorithm is described. In section IV the results of a computer simulation and comparison are shown. Section V and VI are the conclusion and the references, respectively. II.
BACKGROUND
Since concepts of Hilbert spectrum and nonnegative matrix factorization (NMF) are the basis of our proposed algorithm, brief descriptions of them are presented in section A and B, respectively.
ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο π =
Μ (π‘) ππ ππ‘
ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨ο΄ο©ο
In which πΜ is the unwrapped of instantaneous phase. After calculating the instantaneous frequency and amplitude of IMFs, the Hilbert spectrum of signal in each time (π‘ = 1, β¦ , π ) and frequency bin ( π = 1, β¦ , π΅ ) is obtained as equation (5). π ο π(π, π‘) = βπ π=1 ππ (π‘)πΎπ (π‘) ,ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨ο΅ο©ο ο ο
The coefficient πΎππ is equal to 1 if ππ‘β IMF is within the l band, otherwise it is 0. Therefore, the Hilbert spectrum is defined as the weighted sum of the amplitudes of all oscillation modes. th
B. Nonnegative Matrix Factorization Nonnegative Matrix Factorization is a method which is used for decompose a nonnegative matrix X into two nonnegative matrix W and H [4]. ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο π β ππ»ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨οΆο© ο Where π = [π₯ππ ] is a B by N dimensional spectrum matrix, π = [π€ππ ]πβπ΅Γπ and π» = [βππ‘ ]πβπΓπ are base matrix and gain matrix respectively. P parameter is chosen so that the following inequality is kept: (π΅ + π) Γ π < π΅ Γ π
A. Hilbert Spectrum Hilbert Spectrum (HS) represents the distribution of the signal energy as a function of time and frequency [3], [5]. To construct HS, all intrinsic mode functions (IMF) are calculated using EMD method. The relationship between IMFs and original signal is as equation (1). ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο π₯(π‘) = βπ π=1 πΆπ (π‘) + ππ (π‘) ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨ο±ο© Where π is number of oscillation modes and πΆπ is the π intrinsic mode function and ππ is last remaining that is a tone signal. Instantaneous frequency (IF) represents the frequency of a signal at any time instance and is defined as the rate of phase changes at that instant, the instantaneous phase is stated in equation (2). π‘β
ππ (π‘) = arctan (
β[ππ (π‘)]
) , π = 1, β¦ , π
ππ (π‘)
ο ο ο ο ο¨ο²ο©ο
Where β[. ] indicates Hilbert transform of signals. The amplitude of IMFs in the each time is defined as equation (3). 1
ππ (π‘) = [ππ 2 (π‘) + (β[ππ (π‘)])2 ]2
(7)
In the blind speech separation, X can represent the timefrequency of observation signal x. Therefore, the number of rows and columns of the matrix X is equal to the number of frequency bins and the length of time-domain signal x respectively. In the NMF algorithm, weβre interested to estimate two matrices W and H so that its product has a minimum distance from the X matrix. There is conventional cost functions to minimize this distance. Its base on Euclidean distance or least squares. 1
π½πΏπ = π·πΏπ (π; ππ») = βπ β ππ»β22 ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨οΈο© 2
Where βπ β ππ»β22 is the Euclidean distance between X and WH product. Since any matrices, W and H can be existed to minimize the cost function. So, several constraints should be introduced into cost function based on source characters. Constraints are introduced so that the matrices W and H are uniquely determined. The cost function is developed by the constraints as follows: π½ = π·(π; ππ») = π·πΎπΏ (π; ππ») + πΌπΉ(π; ππ»)
(9)
Where πΌ the regularization parameter and F is the added constraint function. In this paper, we are introduced a constraint into NMF based on independence of source signals through minimizing gaussianity each of sources.
, π = 1, β¦ , π ο ο ο ο ο ο ο¨ο³ο©
Instantaneous frequency of a signal is computed as equation (4).
1674
III.
PROPOSED METHOD
Blind separation algorithm that is presented in this paper based on the separation of two or more source signals from a single observation. This model, called βsingle-channel source separationβ, is stated in equation (10). ο π₯(π‘) = βπ· π=1 ππ π π (π‘) + π(π‘)
πππ
In this paper a method are present to decompose observation signal into their sub-signals. The proposed method is based on Modified EEMD (M-EEMD) so that this algorithm is faster than EEMD and its IMFs are orthogonality than the traditional EMD. In EEMD [6], the relationship between the number of ensembles (N), added noise amplitude (π), and the standard deviation of error (πΏ) is stated in equation (11).
π‘ = 0,1, β¦ , π.ο ο ο ο ο¨ο±ο°ο©ο
In which π· is the number of resources that are latten in observation, π is length of observation signal or any of the source signals, n is the additive Gaussian noise signal and ππ is the coefficient of π π‘β source in the mixture.
πΏ=
π βπ
.
(11)
The main goal of our proposed algorithm is decomposition of the single observation into its IMFs using modified EEMD. Second, the modified Hilbert spectrum of any oscillation modes is obtained and then the local spectrums that are greater than a threshold value are selected as new observations.
The aim is to decrease the standard deviation of error (πΏ), by either increasing π or decreasing π. Increasing the number of ensembles N increases the computation time of the algorithm. On the other hand, the amplitude of the added noise (π) cannot be decreased from a threshold value. If a goes under this threshold value, the EMD cannot detects the extrema and EEMD cannot solves the mode mixing problem.
Any of selected local Hilbert spectrums (LHS) are considered as new observation in which the Gaussian noise is removed properly. Using non-Gaussian NMF presented in this paper, any of LHS is separated into two or more independent spectrum. These independent spectrums are returned to the time domain and they are classified into some groups according to the number of sources using KLD clustering. Signals of any groups add together and these added signals are an estimation of latent sources.
In order to find optimum values for N and π , an adaptive method is presented in this stage. In our proposed adaptive method, called M-EEMD, initial values for π and π are considered. Then, a primary value for standard deviation of error is calculated. The observation signal is decomposed into oscillation modes using EEMD with noise amplitude π and the number of ensemble noise π. The standard deviation of IMFs is then calculated using equation (12).
This separation method is based on NMF and maximization of non-Gaussianity and is presented for the first time. A. Modified Hilbert Spectrum As mentioned in Section II to obtain the Hilbert spectrum, first should be oscillation modes of signal are obtained using EMD. In comparison with other signal analysis methods, EMD method is only based on their signal and donβt need to signal parameters such as the window or the base signal. EMD acts as a dyadic filter bank with automatic band pass for a white noise. Therefore, the uniform Gaussian white noise can be separated from the original signal. However, when noise is non-uniformly combined to the signal or when the mixture of signals has the mode mixing problem, then the property of dyadic filter of EMD is failed. This problem not only can cause serious aliasing, but also causes unclear physical interpretations. This problem can be solved by adding a uniform white noise to the signal, several times. This method is called Ensemble Empirical Mode Decomposition (EEMD) [6]. Ensemble term indicates adding noise more than once. However, before EEMD is used, the amplitude of added noises and the number of ensemble noises are two parameters to be adjusted. In [6], the authors assume that the amplitude of the added white noise is about 0.2 times of maximum amplitude of the signal. They also consider that the number of ensemble noises is greater or equal 1000.
1675
1
1
1
π
π
πΏ
πΏ πΏ = βππ=1 [ βπ π=1 βπ=1|ππ (π‘π ) β πππ (π‘π )|] .
(12)
where πππ (. ) is the ith IMF using EEMD after j steps of noise addition, ππ (. ) is the mean corresponding to ith IMF, L is the length of data, and n is the number of IMFs using equation (13). π = βπππ2π β β 1 .
(13)
If the standard deviation obtained from equation (12) is smaller than the primary standard deviation of error, then the amplitude of noise reduces the error. The amplitude of noise is updated using equation (11), and the pervious steps are repeated to obtain a. Otherwise, the best amplitude of added noises a corresponding to ensemble number N is found. Therefore, repeating equation (11) and equation (12) form a recursive loop to converge the optimal values of π and π, as Fig. (1). The number of ensemble noises of M-EEMD is lower than this number in EEMD method; and accuracy of our algorithm is higher than EMD. The notion of local Hilbert spectrum is more efficient than the notion than Hilbert spectrum. Hilbert spectrum of a signal shows the amplitude contribution at any moment of time and frequency bin. While, the local Hilbert spectrum shows the correlation between Hilbert spectrum signal and Hilbert spectrum IMFs at any moment and frequency bin. EMD generates some phantom sources; we
propose to discard IMFs whose local spectrum is less than an experimental threshold. The other m IMFs have to satisfy equation (14).
where (. )π stands here as transpose operator and ππ is the π π‘β moment. The recursive multiplicative update step for gradient descent is given by:
(14)
β 0.01 Γ π»π₯ (π‘, π) β π»π₯ (π‘, π) < β π»π₯ (π‘, π) β π»π (π‘, π)
The Local Hilbert Spectrum (LHS) is defined as equation (15). π»πβ² (π)
= π»π₯ (π). π»π (π)
,
(15)
πππ π = 1,2, β¦ , π.
B. Non-Gaussian NMF Now we assume the products of first and second column of the matrix W in the first and second row of the matrix H gives first and second source Hilbert spectrum, respectably, as equation (16). (16)
π = ππ» = π(: ,1)π»(1, : ) + π(: ,2)π»(2, : ) = π1 π»1 + π2 π»2
in according to the central limit theorem that the sum of two independent sources is more Gaussian than either of the sources, we assume that each of the matrices and π1 π»1 and π2 π»2 are more non-Gaussian than matrix π. Therefore, using maximization of non-Gaussianity matrices π1 π»1 and π2 π»2 , correct estimation of the sources are
obtained. To use non-Gaussianity in NMF, we must have a quantitative measure of non-Gaussianity of a random matrix π1 π»1 and π2 π»2 . In this section we used from criteria kurtosis to measurement of non-Gaussianity as stated in equation (17). ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο πΎπ’ππ‘ππ ππ (π₯) = πΈ{π₯ 4 } β 3(πΈ{π₯ 2 })2 ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨ο±ο·ο©ο Therefore, the cost function is introduced in the equation (8) becomes as bellow equation. 1
πΌππ’ππ‘(π1 π»1 )+π½ππ’ππ‘(π2 π»2 )
2
ππ’ππ‘(π1 π»1 )Γππ’ππ‘(π2 π»2 )
π½ = βπ β ππ»β2πΉ +
ππ½
=
π
2
1
( βπ,π(ππ,π β ππ,1 π»1,π + ππ,2 π»2,π ) +
πππ‘,1 2
1
πΌ(π΅π βπ,π(ππ,1 π»1,π )2 )2 1 β (ππ,1 π»1,π )4 π΅π π,π
)
(19)
After simplifying the equation (19), its matrix notation can be written as: ππ½ ππ1
+πΌ
= β(π β π1 π»1 + π2 π»2 )π»1π
ππ½ ππ1
(21)
where π is positive learning rate that it is chosen so that the first and third term in equation (21) is removed as equation (22). π = π1 ./((π β π1 π»1 + π2 π»2 )π»1π π2 (π1 π»1 )ππππ(π»12 )π1 β4πΌ π42 (π1 π»1 ) 2 (π π2 1 π»1 )ππππ(π»14 )π13 +4πΌ ) π42 (π1 π»1 )
(22)
Equation (22) give us the following update π1 and π»1 : π1 β π1 .β [ππ»1π ]./[(π β π1 π»1 + π2 π»2 )π»1π π2 (π1 π»1 )ππππ(π»12 )π1 β4πΌ π42 (π1 π»1 ) 2 (π π2 1 π»1 )ππππ(π»14 )π13 +4πΌ ] π42 (π1 π»1 )
(23)
π»1 β π»1 .β [π1π π]./[(π β π1 π»1 + π2 π»2 )π»1π π2 (π1 π»1 )ππππ(π12 )π»1 β4πΌ π42 (π1 π»1 ) 2 (π π2 1 π»1 )ππππ(π14 )π»13 +4πΌ ] π42 (π1 π»1 )
(24)
Where .β and ./ are element-wise multiplication and divide, respectively. Similarly, the updates for π2 and π»2 can be obtained.
ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο ο¨ο±οΈο©ο
Where πΌ and π½ parameters control the non-Gaussianity of matrices π1 π»1 and π2 π»2 . Differentiating π½ with respect to a given element π1 gives: πππ‘,1
π1 β π1 β π
(20)
(1) Initialize π=1and ensemble number N =2 (2) Assume ππ = πππ(π(π)) and the standard deviation π of error πΉπ = π . βπ΅ (3) Decompose the signal with EEMD and obtain the matrix of IMFs. (4) Increment m and Calculate πΉπ = π π π π β [ βπ΅ βπ³ |π (π ) β πππ (ππ )|]. π π=π π΅ π=π π³ π=π π π (5) If πΉπ < πΉπβπ then set ππ = πΉπ βπ΅ and go to step 3, π else if πΉπ > π then N =2N and go to step 3. ππ else optimum values of N and a are found. Figure 1. Proposed modified EEMD
4π2 (π1 π»1 )ππππ(π»12 )π1 β 4π22 (π1 π»1 )ππππ(π»14 )π13 π42 (π1 π»1 )
1676
IV.
SIMULATION RESULT
V.
In this section, computer simulations are used to verify the accuracy of the proposed SCSS method and to compare it to other BSS algorithms. In these simulations, proposed algorithms and NMF2D algorithm [12], [13] are tested using a singlechannel signal that includes piano and trumpet mixture. To evaluate the proposed method and other methods in the separation process, a criterions is used called Signal to Noise Ratio (ISNR) [5]. Fig. (2).a shows the time series of mixture of piano and trumpet, which are mixed at ISNR 0dB. Fig. (3). b shows the spectrum of mixed signal and Fig.(3).c and Fig. (3).d shows the spectrum of the estimated trumpet and piano from mixture with ISNR 51 dB, respectively.
In this paper, we proposed a new nonnegative matrix factorization based on maximization of non-Gussainity sources using local Hilbert spectrum. The simulation result indicate that the proposed method improve the quality of reconstruction better than NMF. Proposed method decrease the time consume more than NMF2D. REFERENCES [1] [2] [3]
1 150 0.5
[4] 100
0
-1
[5]
50
-0.5
0
2
4
a
6
0
150
150
100
100
50
50
1
2
b
3
4
[6] [7]
20
40
c
60
[8]
20
40
d
60
Figure 2. a. Time domain mixed signal, b. Spectrogram of mixed signal , c. Spectrogram of piano estimation, d. Spectrogram of trumpet estimation.
TABLE I shows that the ISNR of proposed method is higher than traditional NMF. However, the proposed method in this paper is converged faster than NMF2D. Table I. ISNR and Time consume mixed signal with three different separating methods. Method
ISNR(dB)
NMF [4]
46.0034
Time Consume (Sec) 11.4943
NMF2D [12], [13]
54.3961
36.8092
proposed method
51.0356
20.6075
1677
CONCLUSION
P. Comon, and C. Jutten, βHANDBOOK OF BLIND SOURCE SEPARATIONβ, 1st ed., John Wiley & Sons, New Yourk, Feb 2010. H.G. Ma, Q. Bojiang, Z.Q. Liu, G. Liu, Z.Y. Ma, βA novel blind source separation method for single-channel signal,β Signal Process., vol. 90, no. 8, pp. 3232-3241, 2010. N.E. Huang, Z. Shen, S.R. Long, M.L. Wu, H.H. Shih, Q.Zheng, N.C. Yen, C.C. Tung and H.H. Liu, βThe empirical mode decomposition and Hilbert spectrum for nonlinear and non-stationary time series analysisβ, Proc. Roy. Soc. London A, Vol. 454, pp. 903β 995, 1998. T. O. Virtanen, Monaural sound source separation by perceptually weighted non-negative matrix factorization, Tech. Rep., Tampere University of Technology, 2007. M.K. I. Molla, and K. Hirose, βSingle-mixture audio source separation by subspace decomposition of Hilbert spectrumβ, IEEE Trans. On Audio, Speech, and Lang. Proc., vol. 25, no.3, pp. 893900, Mar. 2007 Z. Wu and N. E. Huang, βEnsemble empirical mode decomposition: A noise assisted data analysis methodβ, Advances in Adaptive Data Analysis, vol. 1, no. 1, pp. 1-41, Jan. 2008. P. Paatero and U. Tapper, βPositive matrix factorization: A nonnegative factor model with optimal utilization of error estimates of data values,β Environ metrics, vol. 5, no. 2, pp. 111β126, 1994.
D.D. Lee and H. Seung, βAlgorithms for non-negative matrix factorizationβ, Adv. Neural Inform. Process Systems, vol 13, pp. 556β 562, 2001. [9] R. Zdunek, and A. Cichocki, βNon-negative matrix factorization with quasi-newton optimizationβ, In Proceedings of the Eighth International Conference on Artificial Intelligence and Soft Computing, ICAISC, June 25β29, 2006. [10] P. Padilla, M. LΓ³pez, J. M. GΓ³rriz, J. RamΓrez, D. Salas-GonzΓ‘lez, I. Γlvarez, βNMF-SVM Based CAD Tool Applied to Functional Brain Images for the Diagnosis of Alzheimerβs Diseaseβ, IEEE Trans. on medical imaging, vol. 31, No. 2, 2012. [11] S. Nakano, K. Yamamoto and S. Nakagawa, βFast NMF based approach and improved VQ based approach for speech recognition from mixed soundβ , Asia-Pacific Signal & Information Processing Association Annual Summit and Conference, pp. 1-4, 2012. [12] M.N. Schmidt and M. MΓΈrup, βNonnegative matrix factor 2-D deconvolution for blind single channel source separationβ, in Proc. of ICA'06, Charleston, SC, USA, 2006. [13] S. Kirbiz and B. Gunsel, βA perceptually enhanced blind singlechannel audio source separation by non-negative matrix factorizationβ, 18th European Signal Processing Conference (EUSIPCO), pp.731-735, 2010.