Sparse-Aware Bias-Compensated Adaptive Filtering Algorithms Using the Maximum Correntropy Criterion for Sparse System Identification with Noisy Input

Wentao Ma 1,2,*, Dongqiao Zheng 1, Zhiyu Zhang 1, Jiandong Duan 1,2,*, Jinzhe Qiu 1 and Xianzhi Hu 3

1 School of Automation and Information Engineering, Xi'an University of Technology, Xi'an 710048, China; [email protected] (D.Z.); [email protected] (Z.Z.); [email protected] (J.Q.)
2 State Key Laboratory of Electrical Insulation and Power Equipment, Xi'an Jiaotong University, Xi'an 710049, China
3 Management Center of Internet Information, Xi'an University of Technology, Xi'an 710048, China; [email protected]
* Correspondence: [email protected] (W.M.); [email protected] (J.D.)

Received: 2 April 2018; Accepted: 7 May 2018; Published: 25 May 2018
Abstract: To address the sparse system identification problem under noisy input and non-Gaussian output measurement noise, two novel types of sparse bias-compensated normalized maximum correntropy criterion algorithms are developed, which are capable of eliminating the impact of non-Gaussian measurement noise and noisy input. The first is developed by using the correntropy-induced metric as the sparsity penalty constraint, which is a smoothed approximation of the $\ell_0$ norm. The second is designed using the proportionate update scheme, which facilitates the close tracking of system parameter change. Simulation results confirm that the proposed algorithms can effectively improve the identification performance compared with other algorithms presented in the literature for the sparse system identification problem.

Keywords: bias-compensated; correntropy-induced metric; maximum correntropy criterion; noisy input; proportionate update; sparse system identification
1. Introduction

The least mean square (LMS), normalized least mean square (NLMS), least mean fourth (LMF), and normalized least mean fourth (NLMF) algorithms have been widely used in adaptive system identification, channel estimation, and echo cancellation [1–4] due to their low complexity and easy implementation. However, their performance usually degrades severely when they are subject to output noise with non-Gaussian characteristics. Correspondingly, many robust adaptive filter algorithms (AFAs) have been developed to reduce the impact of non-Gaussian measurement noise, such as the maximum correntropy criterion (MCC) [5–8], least mean mixed norm (LMMN) [9], and least mean absolute deviation (LMAD) [10] algorithms. Among them, the MCC has been utilized to design different robust algorithms (including the diffusion MCC [11], kernel MCC [12], generalized MCC [13], group-constrained MCC, and reweight group-constrained MCC [14] algorithms, among others [15,16]) to solve different engineering problems.

Although the algorithms mentioned above show robustness against non-Gaussian measurement noise, they show weak performance for sparse system identification (SSI) problems because they do not take advantage of prior knowledge of the system. As a result, two main technologies have been
developed to address the SSI problem: one using the compressed sensing approach [17,18], and the other employing the proportionate update scheme [19]. At present, the zero attracting (ZA) and reweight zero attracting algorithms belonging to the former (such as ZANLMS [20], ZANLMF [21], ZAMCC [22], and the general ZA-proportionate normalized MCC [23]) have been proposed based on a sparse penalty term. The correntropy-induced metric (CIM) [5], as an effective sparse penalty constraint (SPC), has been utilized to improve the performance of the algorithm in SSI, resulting in the CIM-NLMS, CIM-NLMF, CIM-LMMN, and CIM-MCC algorithms [22,24,25]. Correspondingly, the latter algorithms are proportionate-type AFAs (including proportionate NLMS [19], proportionate NLMF [26], proportionate MCC [27], and so on) which use a gain matrix to improve performance.

Although the algorithms above make full use of the sparsity of the system, they do not consider the noisy input problem. As a result, more and more bias-compensation-aware AFAs built on the unbiasedness criterion have been developed to eliminate the influence of noisy input signals [28–31]. These include, for example, the bias-compensated NLMS (BCNLMS) algorithm [28–30], the bias-compensated NLMF algorithm [31], the bias-compensated affine projection algorithm [32], the bias-compensated NMCC (BCNMCC) algorithm [33], and so on. However, they do not consider the sparsity of the system.

On the basis of the analysis above, we develop two novel algorithms, called bias-compensated NMCC with CIM penalty (CIM-BCNMCC) and bias-compensated proportionate NMCC (BCPNMCC), in this work. The former introduces the CIM into the BCNMCC algorithm, while the latter combines the unbiasedness criterion and the PNMCC algorithm. Both of them can achieve better performance than MCC, CIM-MCC, and BCNMCC for SSI under noisy input and non-Gaussian measurement noise.

The rest of the paper is organized as follows: In Section 2, the BCNMCC algorithm is briefly reviewed. In Section 3, the CIM-BCNMCC and BCPNMCC algorithms are developed. In Section 4, simulation results are presented to evaluate the performance of the proposed algorithms. Finally, conclusions are drawn in Section 5.

2. Review of the BCNMCC

2.1. NMCC

Consider an FIR model with a sample of the observed output signal $d(n)$ defined as

$$d(n) = u^T(n)w^o + v(n) \quad (1)$$

where $u(n) = [u(n), u(n-1), \ldots, u(n-M+1)]^T$ denotes the input signal vector; $w^o = [w_1^o, w_2^o, \ldots, w_M^o]^T$ is the M-tap weight vector to be estimated; and $v(n)$ denotes the output measurement noise at time index $n$, described as non-Gaussian in this work. In order to eliminate the impact of the impulsive noise, the MCC is usually used as a cost function to design robust AFAs, and it is defined as [5]

$$J_{MCC}(n) = \exp\left(-\frac{e^2(n)}{2\sigma^2}\right) \quad (2)$$

where $e(n) = d(n) - u^T(n)w(n)$ is the instantaneous estimation error, $w(n) = [w_1(n), w_2(n), \ldots, w_M(n)]^T$ denotes the estimated tap-coefficient vector of $w^o$, and $\sigma$ represents the kernel width, which can be set manually. Using Equation (2) and the gradient method, we have

$$w(n+1) = w(n) + \mu \exp\left(-\frac{e^2(n)}{2\sigma^2}\right)\frac{e(n)u(n)}{u^T(n)u(n) + \varepsilon} \quad (3)$$

where $\mu$ denotes the step size and $\varepsilon$ is a small positive parameter. Equation (3) is the update equation of NMCC.
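As a concrete illustration, a minimal NumPy sketch of one NMCC iteration (Equation (3)) is given below; the function name and parameter defaults are our own, not from the paper.

```python
import numpy as np

def nmcc_update(w, u, d, mu, sigma=20.0, eps=1e-3):
    """One NMCC iteration, Equation (3); w and u are length-M vectors."""
    e = d - u @ w                              # instantaneous error e(n)
    k = np.exp(-e**2 / (2.0 * sigma**2))       # Gaussian weighting from the MCC cost
    return w + mu * k * e * u / (u @ u + eps)  # normalized gradient step
```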
2.2. BCNMCC

To identify the system parameters under noisy input and non-Gaussian output measurement noise, a bias-compensated normalized maximum correntropy criterion (BCNMCC) algorithm has been previously proposed [33]. Due to the noisy input, $u(n)$ and $e(n)$ in Equation (3) should be replaced by $\bar{u}(n)$ and $\bar{e}(n)$, where the noisy input vector is defined as

$$\bar{u}(n) = u(n) + v_{in}(n) \quad (4)$$

where $v_{in}(n) = [v_{in}(n), v_{in}(n-1), \ldots, v_{in}(n-M+1)]^T$ is the input noise with zero mean and variance $\sigma_{in}^2$, and the corresponding error is

$$\bar{e}(n) = d(n) - \bar{u}^T(n)w(n). \quad (5)$$

To compensate for the bias caused by the input noise, the BCNMCC algorithm is developed by introducing a bias-compensation term $B(n)$ into the weight update equation of the normalized MCC algorithm as follows:

$$w(n+1) = w(n) + \mu \exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)\bar{u}(n) + \varepsilon} + B(n). \quad (6)$$
To compute $B(n)$, using the weight-error vector (WEV) $\tilde{w}(n) = w^o - w(n)$, Equation (6) yields

$$\tilde{w}(n+1) = \tilde{w}(n) - \mu \exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)\bar{u}(n) + \varepsilon} - B(n). \quad (7)$$

Furthermore, the following unbiasedness criterion [28] is employed:

$$E[\tilde{w}(n+1) \mid \bar{u}(n)] = 0 \quad \text{whenever} \quad E[\tilde{w}(n) \mid \bar{u}(n)] = 0 \quad (8)$$

and by some simplified calculations, we have

$$B(n) = \mu \exp\left(-\frac{v^2(n)}{2\sigma^2}\right)\frac{\sigma_{in}^2 w(n)}{\bar{u}^T(n)\bar{u}(n) + \varepsilon}. \quad (9)$$

Now, combining Equations (6) and (9), we obtain

$$w(n+1) = \left(1 + \mu \exp\left(-\frac{v^2(n)}{2\sigma^2}\right)\frac{\sigma_{in}^2}{\bar{u}^T(n)\bar{u}(n) + \varepsilon}\right)w(n) + \mu \exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)\bar{u}(n) + \varepsilon} \quad (10)$$
which is the weight update equation of the BCNMCC algorithm.

3. Sparse-Aware BCNMCC Algorithms

3.1. CIM-BCNMCC

3.1.1. Correntropy-Induced Metric

In this subsection, we focus on developing a novel sparse BCNMCC algorithm with CIM to solve the SSI problem. Here, we first briefly review the CIM. Given two vectors $X = [x_1, x_2, \ldots, x_N]^T$ and $Y = [y_1, y_2, \ldots, y_N]^T$ in the sample space, the CIM is defined as

$$CIM(X, Y) = \left(\kappa(0) - \hat{V}(X, Y)\right)^{1/2} \quad (11)$$

where $\kappa(0) = 1/(\sigma\sqrt{2\pi})$ and $\hat{V}(X, Y) = \frac{1}{N}\sum_{i=1}^{N}\kappa(x_i, y_i)$ is the sample estimate of the correntropy.
The most popular kernel in correntropy is the Gaussian kernel $\kappa(x, y) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{e^2}{2\sigma^2}\right)$, with $e = x - y$.

The $\ell_0$ norm of the vector $X = [x_1, x_2, \ldots, x_N]^T$ can be approximated by

$$\|X\|_0 \sim CIM^2(X, 0) = \frac{\kappa(0)}{N}\sum_{i=1}^{N}\left(1 - \exp\left(-x_i^2/2\sigma^2\right)\right). \quad (12)$$

In Equation (12), if $|x_i| > \delta$, $\forall x_i \neq 0$, then as $\sigma \to 0$ it becomes close to the $\ell_0$ norm, where $\delta$ is a small positive number. Figure 1 shows the surface of $CIM(X, 0)$ with $X = [x_1, x_2]^T$, plotted as a function of $x_1$ and $x_2$. Due to its relation with correntropy, this nonlinear metric is called the correntropy-induced metric (CIM) and can provide a good approximation for the $\ell_0$ norm. Hence, it favors sparsity and can be used as a sparse penalty constraint (SPC) to exploit the system sparsity in SSI scenarios; a proof can be found elsewhere [5].
Figure 1. Correntropy-induced metric (CIM) surface plot showing distance regions ($\sigma$ = 0.02).
As an approximation of the $\ell_0$ norm, the CIM favors sparsity and can be used as a penalty term in the SSI problem. The CIM-MCC algorithm has been proposed [22] to solve sparse channel estimation in an impulsive noise environment. The goal of this work is to develop a robust and sparse AFA combining the BCNMCC and CIM. The CIM constraint imposes a zero attraction on the filter coefficients according to the relative value of each coefficient among all the entries, which, in turn, leads to improved performance when the system is sparse.
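A minimal sketch of the approximation in Equation (12) and of its gradient (the zero-attraction term that appears later in Equation (13)) is given below; the function names and the default kernel width are illustrative, not from the paper.

```python
import numpy as np

def cim_sq(x, sigma1=0.02):
    """CIM^2(x, 0) from Equation (12): a smoothed l0-norm approximation."""
    N = len(x)
    kappa0 = 1.0 / (sigma1 * np.sqrt(2.0 * np.pi))
    return (kappa0 / N) * np.sum(1.0 - np.exp(-x**2 / (2.0 * sigma1**2)))

def cim_sq_grad(x, sigma1=0.02):
    """Elementwise gradient of CIM^2(x, 0); this is the zero-attraction
    term used in the CIM-BCNMCC update of Equation (13)."""
    N = len(x)
    return x * np.exp(-x**2 / (2.0 * sigma1**2)) / (np.sqrt(2.0 * np.pi) * N * sigma1**3)
```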
3.1.2. CIM-BCNMCC

In general, SPC-aware AFAs are designed by adding the derivative of an approximated $\ell_0$ norm with respect to the weight vector into the update equation of a given original AFA. Naturally, we can develop the BCNMCC algorithm with CIM (denoted as CIM-BCNMCC) by combining Equation (10) and the gradient of Equation (12) with a sparse controlling factor. Then, we obtain the weight update equation of the CIM-BCNMCC algorithm as

$$w(n+1) = \left(1 + \mu \exp\left(-\frac{v^2(n)}{2\sigma^2}\right)\frac{\sigma_{in}^2}{\bar{u}^T(n)\bar{u}(n) + \varepsilon}\right)w(n) + \mu \exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)\bar{u}(n) + \varepsilon} - \rho\,\frac{1}{\sqrt{2\pi}M\sigma_1^3}\,w(n) \odot \exp\left(-\frac{w^2(n)}{2\sigma_1^2}\right) \quad (13)$$

where $\rho$ is a sparse controlling factor and $\odot$ and the exponential in the last term act elementwise. It is worth noting that the kernel width $\sigma_1$ should be selected suitably to ensure that the CIM is close to the $\ell_0$ norm.

Remark 1. The CIM-BCNMCC algorithm takes advantage of the BCNMCC algorithm and the CIM; hence, it can solve the SSI problem under noisy input and output noise with impulsive characteristics. Note that CIM-BCNMCC reduces to the BCNMCC algorithm when $\rho = 0$. The extra computational complexity relative to BCNMCC comes from the third term on the right-hand side of Equation (13), but this term yields a better identification result.
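Following the same pattern, one CIM-BCNMCC iteration (Equation (13)) can be sketched as below. The argument v_n stands for the $v(n)$ term of Equation (13), and sigma_in2 for the input noise variance, which Equation (13) assumes to be known or estimated; all names and defaults are illustrative.

```python
import numpy as np

def cim_bcnmcc_update(w, u_bar, d, v_n, mu, rho=1e-5,
                      sigma=20.0, sigma1=0.02, sigma_in2=1.0, eps=1e-3):
    """One CIM-BCNMCC iteration, Equation (13)."""
    M = len(w)
    e_bar = d - u_bar @ w                       # noisy-input error, Equation (5)
    norm = u_bar @ u_bar + eps
    bias = mu * np.exp(-v_n**2 / (2 * sigma**2)) * sigma_in2 / norm  # compensation factor
    za = w * np.exp(-w**2 / (2 * sigma1**2)) / (np.sqrt(2 * np.pi) * M * sigma1**3)
    return ((1 + bias) * w
            + mu * np.exp(-e_bar**2 / (2 * sigma**2)) * e_bar * u_bar / norm
            - rho * za)                         # zero-attraction (CIM gradient) term
```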
3.2. BCPNMCC

3.2.1. PMCC

The proportionate MCC algorithm has been proposed previously [27], and its weight update equation is

$$w(n+1) = w(n) + \mu G(n)\exp\left(-\frac{e^2(n)}{2\sigma^2}\right)e(n)u(n) \quad (14)$$

where $G(n) = \mathrm{diag}(g_1(n), g_2(n), \ldots, g_M(n))$ is a diagonal matrix which can update the step size of each tap adaptively. The individual gain $g_l(n)$ governing the step size adjustment of the $l$th weight coefficient is defined as

$$g_l(n) = \frac{\gamma_l(n)}{\sum_{i=1}^{M}\gamma_i(n)}, \quad 1 \leq l \leq M \quad (15)$$

with

$$\gamma_l(n) = \max\left(\xi \max(\delta, |w_1(n)|, |w_2(n)|, \ldots, |w_M(n)|), |w_l(n)|\right) \quad (16)$$

where $\xi$ and $\delta$ are positive parameters, with typical values $\xi = 5/M$ and $\delta = 0.01$.
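A compact sketch of the gain computation of Equations (15) and (16), returning the diagonal of $G(n)$ as a vector, might read as follows (the function name is ours):

```python
import numpy as np

def prop_gain(w, xi=None, delta=0.01):
    """Proportionate gains g_1..g_M from Equations (15)-(16)."""
    M = len(w)
    if xi is None:
        xi = 5.0 / M                                      # typical value from the text
    aw = np.abs(w)
    gamma = np.maximum(xi * max(delta, aw.max()), aw)     # Equation (16)
    return gamma / gamma.sum()                            # Equation (15)
```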
3.2.2. BCPNMCC

In this subsection, we develop a bias-compensated PNMCC algorithm to improve the performance of the PMCC algorithm for the noisy input case; its derivation is given below. As with the BCNMCC algorithm, we introduce a term $B(n)$ into Equation (14) to compensate for the bias caused by the input noise:

$$w(n+1) = w(n) + \mu G(n)\exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n) + \varepsilon} + B(n). \quad (17)$$

Considering the input noise, the output error is then denoted as

$$\bar{e}(n) = d(n) - \bar{u}^T(n)w(n) = d(n) - (u(n) + v_{in}(n))^T w(n) = e_w(n) + v(n) - v_{in}^T(n)w(n) \quad (18)$$

where $e_w(n) = u^T(n)\tilde{w}(n)$ stands for the a priori error. Then, combining Equation (17) and the WEV $\tilde{w}(n) = w^o - w(n)$ yields

$$\tilde{w}(n+1) = \tilde{w}(n) - \mu G(n)\exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n) + \varepsilon} - B(n). \quad (19)$$

In order to obtain $B(n)$, taking the expectation on both sides of Equation (19) for the given $\bar{u}(n)$, we have
$$E[\tilde{w}(n+1) \mid \bar{u}(n)] = E[\tilde{w}(n) \mid \bar{u}(n)] - \mu E\left[f(\bar{e}(n))\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] - E[B(n) \mid \bar{u}(n)]. \quad (20)$$
By using the unbiasedness criterion given by Equation (8) in Equation (20), we obtain

$$E[B(n) \mid \bar{u}(n)] = -\mu E\left[f(\bar{e}(n))\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] \quad (21)$$

where $f(\bar{e}(n))$ denotes a nonlinear function of the estimation error defined by

$$f(\bar{e}(n)) = \exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right) = \exp\left(-\frac{(e_w(n) + v(n) - v_{in}^T(n)w(n))^2}{2\sigma^2}\right). \quad (22)$$

To derive the BCPNMCC algorithm reliably, the following commonly used assumptions [29,30,34] are given first.

Assumption 1. The signals $v(n)$, $v_{in}(n)$, $u(n)$, and $\tilde{w}(n)$ are statistically independent.

Assumption 2. The nonlinear function of the estimation error $f(v(n))$, $v_{in}(n)$, $G(n)$, and $\bar{e}(n)$ are statistically independent.

Assumption 3. The successive increments of the tap weights are independent of one another, and the error and input vector sequences are statistically independent of one another.

In order to simplify the nonlinear function $f(\bar{e}(n))$ and the mathematical derivation, we take the Taylor expansion of $f(\bar{e}(n))$ with respect to $e_w(n) - v_{in}^T(n)w(n)$ around $v(n)$ and use Equation (22) to yield

$$f(\bar{e}(n)) \approx f(v(n)) + f'(v(n))\left[e_w(n) - v_{in}^T(n)w(n)\right] + o\left(\left[e_w(n) - v_{in}^T(n)w(n)\right]^2\right). \quad (23)$$
Using Equation (23), we obtain the following approximation of Equation (21):

$$E\left[f(\bar{e}(n))\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] \approx E\left[f(v(n))\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] + E\left[f'(v(n))[e_w(n) - v_{in}^T(n)w(n)]\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] + E\left[o\left([e_w(n) - v_{in}^T(n)w(n)]^2\right)\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right]. \quad (24)$$

In general, the a priori error $e_w(n)$ converges to a small value when the algorithm is close to its steady state, and it can be ignored with respect to the environmental noise when the step size is small [6]. Under Assumptions 1–3 and considering this fact, the second part of Equation (24) becomes

$$E\left[f'(v(n))[e_w(n) - v_{in}^T(n)w(n)]\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] \approx -E\left[f'(v(n))v_{in}^T(n)w(n)\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] \approx 0. \quad (25)$$

Furthermore, the third part of Equation (24) becomes

$$E\left[o\left([e_w(n) - v_{in}^T(n)w(n)]^2\right)\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] \approx 0. \quad (26)$$

Combining Equations (24)–(26) and applying Assumption 3, we have

$$E\left[f(\bar{e}(n))\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] \approx E\left[f(v(n))\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] = E[f(v(n)) \mid \bar{u}(n)]\, E\left[\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right]. \quad (27)$$
Considering the fact that $\bar{e}(n) = e(n) - v_{in}^T(n)w(n)$, we have

$$E\left[\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] = \frac{E[G(n)\bar{e}(n)\bar{u}(n) \mid \bar{u}(n)]}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} = \frac{E[G(n) \mid \bar{u}(n)]\, E[\bar{e}(n)\bar{u}(n) \mid \bar{u}(n)]}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \quad (28)$$

where

$$E[\bar{e}(n)\bar{u}(n) \mid \bar{u}(n)] = E\left[[e(n) - v_{in}^T(n)w(n)][u(n) + v_{in}(n)] \mid \bar{u}(n)\right] = E[e(n)\bar{u}(n) \mid \bar{u}(n)] - E[v_{in}^T(n)w(n)u(n) \mid \bar{u}(n)] - E[v_{in}^T(n)w(n)v_{in}(n) \mid \bar{u}(n)]. \quad (29)$$

Under Assumptions 1–3, we have

$$E[e(n)\bar{u}(n) \mid \bar{u}(n)] = E\left[[v(n) + u^T(n)\tilde{w}(n)][u(n) + v_{in}(n)] \mid \bar{u}(n)\right] = E[v(n)u(n) \mid \bar{u}(n)] + E[v(n)v_{in}(n) \mid \bar{u}(n)] + E[u^T(n)\tilde{w}(n)u(n) \mid \bar{u}(n)] + E[u^T(n)\tilde{w}(n)v_{in}(n) \mid \bar{u}(n)] = 0 \quad (30)$$

and

$$E[v_{in}^T(n)w(n)u(n) \mid \bar{u}(n)] = 0 \quad (31)$$

$$E[v_{in}^T(n)w(n)v_{in}(n) \mid \bar{u}(n)] = \sigma_{in}^2 E[w(n) \mid \bar{u}(n)]. \quad (32)$$
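Equation (32) is straightforward to verify numerically; the following sanity check (ours, not part of the paper) confirms that $E[v_{in}^T(n)w(n)v_{in}(n)] = \sigma_{in}^2 w(n)$ for zero-mean white input noise:

```python
import numpy as np

# Monte Carlo check of Equation (32): for zero-mean white input noise v_in
# with variance sigma_in^2, E[(v_in' w) v_in] = sigma_in^2 * w.
rng = np.random.default_rng(0)
M, sigma_in2, trials = 8, 0.5, 200000
w = rng.standard_normal(M)
v_in = rng.normal(0.0, np.sqrt(sigma_in2), size=(trials, M))
lhs = np.mean((v_in @ w)[:, None] * v_in, axis=0)  # sample mean of (v_in' w) v_in
print(np.allclose(lhs, sigma_in2 * w, atol=0.02))  # True up to Monte Carlo error
```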
Combining Equations (28)–(32), we get

$$E\left[\frac{\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right] = -E\left[\frac{\sigma_{in}^2 w(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right]. \quad (33)$$

Then, substituting Equations (28) and (33) into Equation (21) yields

$$E[B(n) \mid \bar{u}(n)] = \mu E\left[\exp\left(-\frac{v^2(n)}{2\sigma^2}\right) \,\middle|\, \bar{u}(n)\right] E[G(n) \mid \bar{u}(n)]\, E\left[\frac{\sigma_{in}^2 w(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon} \,\middle|\, \bar{u}(n)\right]. \quad (34)$$

Now, using the stochastic approximation, we have

$$B(n) = \mu \exp\left(-\frac{v^2(n)}{2\sigma^2}\right)\frac{G(n)\sigma_{in}^2 w(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon}. \quad (35)$$

Substituting Equation (35) into Equation (17), we obtain

$$w(n+1) = \left(1 + \mu \exp\left(-\frac{v^2(n)}{2\sigma^2}\right)\frac{G(n)\sigma_{in}^2}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon}\right)w(n) + \mu \exp\left(-\frac{\bar{e}^2(n)}{2\sigma^2}\right)\frac{G(n)\bar{e}(n)\bar{u}(n)}{\bar{u}^T(n)G(n)\bar{u}(n)+\varepsilon}. \quad (36)$$
As a result, the update equation of the proposed BCPNMCC algorithm is given by Equation (36).

Remark 2. The structure of Equation (36) is similar to that of Equation (10), and the proposed BCPNMCC algorithm reduces to the BCNMCC algorithm when the gain matrix is the identity matrix. In addition, Equation (36) reduces to the proportionate update of Equation (14), in normalized form, when the input noise variance is zero. Hence, the proposed BCPNMCC algorithm combines the advantages of both the BCNMCC and PMCC algorithms. Furthermore, the variance of the input noise is usually unknown and should be estimated effectively; we employ the method proposed in [35] to estimate $\sigma_{in}^2$ in this work.
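Gathering the pieces, one BCPNMCC iteration (Equation (36)) can be sketched as follows. Since $G(n)$ is diagonal, it is carried as the gain vector of Equations (15) and (16); v_n stands for the $v(n)$ term of Equation (36), and all names and defaults are illustrative.

```python
import numpy as np

def bcpnmcc_update(w, u_bar, d, v_n, mu, sigma=20.0, sigma_in2=1.0, eps=1e-3):
    """One BCPNMCC iteration, Equation (36)."""
    M = len(w)
    aw = np.abs(w)
    gamma = np.maximum((5.0 / M) * max(0.01, aw.max()), aw)  # Equation (16), xi=5/M, delta=0.01
    g = gamma / gamma.sum()                                  # Equation (15), diagonal of G(n)
    e_bar = d - u_bar @ w                                    # noisy-input error
    norm = u_bar @ (g * u_bar) + eps                         # u_bar' G u_bar + eps
    bias = mu * np.exp(-v_n**2 / (2 * sigma**2)) * g * sigma_in2 / norm
    return (1 + bias) * w + mu * np.exp(-e_bar**2 / (2 * sigma**2)) * g * e_bar * u_bar / norm
```

Setting the gain vector g to all ones recovers the BCNMCC update of Equation (10), and setting sigma_in2 = 0 removes the compensation, consistent with Remark 2.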
4. Simulation Results

In this section, we present computer simulations to evaluate the performance of the proposed CIM-BCNMCC and BCPNMCC algorithms, and we select the MCC, CIM-MCC, and BCNMCC algorithms as comparison objects. In the following simulations, we set the parameter vector of an unknown time-varying system as shown in Figure 2.
Figure 2. Impulse response of the sparse system: (a) sparsity rate (SR) = 5/32; (b) SR = 5/64; (c) SR = 9/64; (d) SR = 5/128.
In Figure 2, the memory size M is set at 32, 64, or 128 during different iterations. Here, we define the sparsity rate (SR) as

$$SR = \frac{N_{non\text{-}zeros}}{M} \quad (37)$$
where $N_{non\text{-}zeros}$ is the number of nonzero taps in $w^o$. We assume that the background noise $v(n)$ has an α-stable distribution and that the input noise $v_{in}(n)$ is zero-mean white Gaussian noise. The characteristic function of the α-stable distribution is defined as

$$f(t) = \exp\left\{j\theta t - \gamma|t|^{\alpha}\left[1 + j\beta\,\mathrm{sgn}(t)S(t, \alpha)\right]\right\} \quad (38)$$

in which

$$S(t, \alpha) = \begin{cases} \tan\dfrac{\alpha\pi}{2}, & \text{if } \alpha \neq 1 \\ \dfrac{2}{\pi}\log|t|, & \text{if } \alpha = 1 \end{cases} \quad (39)$$

where $\alpha \in (0, 2]$ is the characteristic factor, $-\infty < \theta < \infty$ is the location parameter, $\beta \in [-1, 1]$ is the symmetry parameter, and $\gamma > 0$ is the dispersion parameter. We denote the parameter vector of the characteristic function as $V_{\alpha\text{-stable}}(\alpha, \beta, \gamma, \theta)$.

All results in the following simulations are obtained by averaging over 100 independent Monte Carlo runs to obtain the mean square deviation (MSD), defined as

$$MSD = 10\log_{10} E\left[\frac{\|w^o - w(n)\|^2}{\|w^o\|^2}\right]. \quad (40)$$

In order to verify the performance of the new algorithms, we choose the optimal weight vector for different sparsities. The other parameters are set to achieve the best performance for each algorithm. The kernel width $\sigma$ of MCC is 20, and $\sigma_1$ is 0.02 for CIM; the input noise signal is zero-mean with variance $\sigma_{in}^2 = 1$, and the output measurement noise is set as $V_{\alpha\text{-stable}}(1.2, 0, 0.4, 0)$. The positive parameter $\rho$ is $1 \times 10^{-5}$, and $\varepsilon = 0.001$.
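For reproducibility, samples with the characteristic function of Equation (38) (for $\alpha \neq 1$) can be drawn with the Chambers-Mallows-Stuck method. The sketch below is ours; it applies the dispersion as $\gamma^{1/\alpha}$, one common convention that should be checked against the intended parameterization, and also includes the single-run MSD of Equation (40).

```python
import numpy as np

def alpha_stable(alpha, beta, gamma, theta, size, rng=None):
    """Chambers-Mallows-Stuck sampler for alpha-stable noise (alpha != 1)."""
    rng = np.random.default_rng() if rng is None else rng
    V = rng.uniform(-np.pi / 2, np.pi / 2, size)   # uniform phase
    W = rng.exponential(1.0, size)                 # unit exponential
    t = beta * np.tan(np.pi * alpha / 2)
    B = np.arctan(t) / alpha
    S = (1 + t**2) ** (1 / (2 * alpha))
    X = (S * np.sin(alpha * (V + B)) / np.cos(V) ** (1 / alpha)
         * (np.cos(V - alpha * (V + B)) / W) ** ((1 - alpha) / alpha))
    return gamma ** (1 / alpha) * X + theta        # scale by dispersion, shift by location

def msd_db(w_true, w_est):
    """Normalized MSD in dB, Equation (40), for a single realization."""
    return 10 * np.log10(np.sum((w_true - w_est)**2) / np.sum(w_true**2))
```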
4.1. Sparse System Identification

In the first example, we investigate the convergence performance of all the algorithms mentioned above in terms of MSD under different sparsity rates. The step sizes are selected so that the same initial convergence speeds are obtained for all algorithms. The convergence curves are given in Figure 3a,b under different SRs (5/64 and 9/64). The results indicate that (1) the bias-compensated aware AFAs can achieve lower steady-state misadjustment than other algorithms because the bias
caused by the noisy input can be effectively compensated by the bias compensation term; and (2) the noisy input has a remarkable influence on the identification performance of the AFAs for SSI. Consequently, it is meaningful to study sparse-aware bias-compensated AFAs. In addition, we give the convergence curves of the BCNMCC, CIM-BCNMCC, and BCPNMCC algorithms with sparsity levels $N_{non\text{-}zeros} = 5$ and $N_{non\text{-}zeros} = 9$ in Figure 4. One can see that the proposed algorithms show better steady-state accuracy than the BCNMCC algorithm, which illustrates the advantages of the proposed methods for SSI. In particular, no matter how sparse the system is, the BCPNMCC algorithm performs better than the CIM-BCNMCC algorithm, which is consistent with the results in [24]. Hence, we mainly perform simulations to investigate the performance of the proposed BCPNMCC algorithm in the following examples.
Figure 3. Convergence curves in terms of mean square deviation (MSD). The length of the filter is 64. $\sigma_{in}^2 = 1$, $\rho = 1 \times 10^{-5}$, $\sigma_1 = 0.02$. (a) SR = 5/64; (b) SR = 9/64.
Figure 4. The convergence curves of the bias-compensated aware algorithms. (a) SR = 5/64; (b) SR = 9/64.
In the second example, we present the convergence performance of the proposed and conventional algorithms obtained with different memory sizes represented by SR (a, SR = 5/32; b, SR = 5/64; and c, SR = 5/128). The results are shown in Figure 5; we conclude that the proportionate-type algorithms show the best performance, regardless of the length of the weight vector, compared with the other algorithms. In addition, Figure 6 gives the results of a time-varying system identification case represented by SR, i.e., we set SR = 5/64 and SR = 9/64 in the first and second stages, respectively, to exhibit the tracking ability of the proposed algorithm. One can observe that the proposed algorithm obtains the best performance no matter how sparse the system is.
-10 -10
MCC MCC ((=0.006, =0.006, =20) =20) PMCC PMCC ((=0.07, =0.07, =20) =20) BCNMCC BCNMCC ((=0.3, =0.3, =20) =20) BCPNMCC BCPNMCC ((=0.12, =0.12, =20) =20)
-32 -32
-12 -12 -14 -14 MSD(dB) MSD(dB)
MSD(dB) MSD(dB)
-34 -34 -36 -36
-16 -16
-38 -38
-18 -18
-40 -40
-20 -20
-42 -420 0
MCC MCC ((=0.006, =0.006, =20) =20) PMCC PMCC ((=0.07, =0.07, =20) =20) BCNMCC BCNMCC ((=0.3, =0.3, =20) =20) BCPNMCC BCPNMCC ((=0.12, =0.12, =20) =20)
1000 1000
2000 2000 Iterations Iterations
3000 3000
-22 -220 0
4000 4000
1000 1000
(a) (a)
2000 2000 Iterations Iterations
3000 3000
4000 4000
(b) (b) -10 -10 MCC MCC ((=0.006, =0.006, =20) =20) PMCC PMCC ((=0.07, =0.07, =20) =20) BCNMCC BCNMCC ((=0.3, =0.3, =20) =20) BCPNMCC BCPNMCC ((=0.12, =0.12, =20) =20)
MSD(dB) MSD(dB)
-15 -15
-20 -20
-25 -250 0
1000 1000
2000 2000
3000 3000 4000 4000 Iterations Iterations
5000 5000
6000 6000
(c) (c) Figure Figure 5. 5. Convergence Convergence curves curves of of four four algorithms algorithms with with different different memory memory sizes. sizes. (a) (a) SR SR == 5/32; 5/32; (b) (b) SR SR == Figure 5. Convergence curves of four algorithms with different memory sizes. (a) SR = 5/32; 5/64; (c) SR = 5/128. 5/64; SR = 5/128. (b) SR(c) = 5/64; (c) SR = 5/128. -8 -8 -10 -10
MSD(dB) MSD(dB)
-12 -12
MCC MCC ((=0.006, =0.006, =20) =20) PMCC PMCC ((=0.07, =0.07, =20) =20) BCNMCC ( =0.3, BCNMCC (=0.3, =20) =20) BCPNMCC BCPNMCC ((=0.12, =0.12, =20) =20)
-14 -14 -16 -16 -18 -18 -20 -20 -22 -220 0
2000 2000
4000 4000 Iterations Iterations
6000 6000
8000 8000
Figure 6. Convergence curves of four algorithms under different SRs. Figure Figure6. 6. Convergence Convergencecurves curvesof offour fouralgorithms algorithmsunder underdifferent differentSRs. SRs.
In the third example, we compare the convergence speed of the proposed algorithm with those of other algorithms. The step sizes are selected to obtain the same MSD for each algorithm. The convergence curves are shown in Figure 7; one can observe that the proposed BCPNMCC algorithm shows outstanding convergence speed compared with the BCNMCC algorithm because of the adaptive step size adjustments by the gain matrix. Compared with the MCC algorithm, the BCPNMCC algorithm shows a better steady-state MSD because of the proportionate update scheme and the bias-compensating term.
Figure 7. Convergence curves of four algorithms (SR = 5/64).
In the fourth example, we perform a simulation to further evaluate the robustness of the proposed algorithm with different values of $\sigma_{in}^2$ (0.3, 0.5, 0.65, 0.8, and 1). The other parameters are set the same as for the third example. The steady-state MSD is illustrated in Figure 8; we see that no matter how big $\sigma_{in}^2$ is, the bias-compensated aware algorithms (including BCNMCC and BCPNMCC) show better performance than their original versions (i.e., MCC and PMCC). Furthermore, the PMCC and BCPNMCC algorithms show higher steady-state accuracy than the MCC and BCNMCC algorithms. In addition, the effect of the kernel width on the proposed algorithm is examined. We set $\sigma$ at 5, 10, 20, 30, 40, and 50, respectively. Figure 9 gives the steady-state MSD result for each $\sigma$. It is obvious that (1) the proposed BCPNMCC algorithm outperforms the MCC, PMCC, and BCNMCC algorithms; and (2) the BCPNMCC algorithm achieves optimal performance when $\sigma$ is 20, and a linear relationship does not exist between the performance and the value of $\sigma$, which means that the kernel width should be selected in different applications to achieve the desired performance.
Figure 8. Steady-state MSDs with different values of $\sigma_{in}^2$; $\sigma = 20$, SR = 5/64.
Figure 9. Steady-state MSDs with different values of $\sigma$; $\sigma_{in}^2 = 1$, SR = 5/64.
Finally, the joint effect of the free parameters on the performance is quantitatively investigated, with curved surfaces presented for the relationship between the steady-state MSD and the parameter pairs ($\sigma$, $\sigma_{in}^2$) and ($\mu$, $\sigma_{in}^2$). The steady-state MSD results are given in Figures 10 and 11. One can observe in Figure 10 that the steady-state accuracy decreases as the input noise variance increases; this result is most pronounced when the input noise variance and the kernel width are in the ranges of (0.2–1) and (10–40), respectively. A similar conclusion can be drawn from Figure 11: as the step size increases, the steady-state accuracy of the proposed algorithm decreases markedly, and this is most pronounced when the step size and the input noise variance are in the ranges of (0.08–0.16) and (0.2–1), respectively. In addition, one can also see that the proposed algorithm outperforms the BCNMCC algorithm in this case.

Figure 10. Steady-state MSDs of BCPNMCC for variations in $\sigma$ and $\sigma_{in}^2$; SR = 5/64.

Figure 11. Steady-state MSDs of BCPNMCC for variations in $\mu$ and $\sigma_{in}^2$; SR = 5/64.
4.2. Sparse Echo Channel Estimation

In this subsection, we consider a computer simulation of the sparse echo channel estimation scenario to evaluate the performance of the proposed BCPNMCC algorithm. The echo channel path is given in Figure 12; it corresponds to a channel with length M = 1024 and only 56 nonzero coefficients. We use a fragment of 2 s of real speech as the input signal, sampled at 8 kHz. The impulsive measurement noise is $V_{\alpha\text{-stable}}(1.2, 0, 0.2, 0)$, and the input noise variance is $\sigma_{in}^2 = 0.25$. The other parameters are set in order to obtain the optimal performance for all algorithms. The convergence curves are shown in Figure 13. Compared with the other algorithms, the proposed BCPNMCC algorithm works well in this practical scenario, again demonstrating the practical character of the proposed algorithm for engineering applications.

Figure 12. Impulse response of a sparse echo channel with length M = 1024 and 52 nonzero coefficients.
Figure 13. Convergence curves for sparse echo response with speech input.
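As a hedged end-to-end sketch of this experiment, the following harness reuses bcpnmcc_update(), alpha_stable(), and msd_db() from the earlier sketches. The AR(1) input is a stand-in for the real 8 kHz speech fragment, the channel taps are random placeholders, and drawing a fresh input-noise vector per step is a simplification of a true noisy delay line.

```python
import numpy as np

rng = np.random.default_rng(1)
M, n_iter = 1024, 16000
w_true = np.zeros(M)
w_true[rng.choice(M, 56, replace=False)] = rng.standard_normal(56)  # sparse echo path

x = np.zeros(n_iter)
for n in range(1, n_iter):
    x[n] = 0.95 * x[n - 1] + rng.standard_normal()      # speech-like colored input
v_out = alpha_stable(1.2, 0.0, 0.2, 0.0, n_iter, rng)   # impulsive output noise

w = np.zeros(M)
for n in range(M, n_iter):
    u = x[n - M + 1:n + 1][::-1]                        # delay line [x(n),...,x(n-M+1)]
    d = u @ w_true + v_out[n]
    u_bar = u + rng.normal(0.0, np.sqrt(0.25), M)       # noisy input, sigma_in^2 = 0.25
    w = bcpnmcc_update(w, u_bar, d, v_out[n], mu=0.001, sigma=20.0, sigma_in2=0.25)
print(msd_db(w_true, w))                                # final single-run MSD in dB
```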
5. Conclusions

Sparse-aware BCNMCC algorithms for sparse system identification under noisy input and output noise with non-Gaussian characteristics have been developed in this paper, namely, the CIM-BCNMCC and BCPNMCC algorithms. They can achieve a higher identification accuracy of the system parameters due to the introduction of the CIM penalty and the proportionate update scheme. In particular, the proposed BCPNMCC algorithm can also provide a faster convergence speed than the BCNMCC algorithm because of the adaptive step size adjustment by the gain matrix. More importantly, the BCPNMCC algorithm can lead to considerable improvements in the noisy input case for the SSI problem because it takes advantage of the bias-compensation term derived from the unbiasedness criterion. The simulation results demonstrate that (1) the CIM-BCNMCC and BCPNMCC algorithms perform better than other conventional algorithms; (2) the BCPNMCC algorithm outperforms CIM-BCNMCC when they are used for sparse system identification; and (3) regardless of the step size, the input noise variance, and the kernel width, the BCPNMCC algorithm can always achieve the best performance.

Author Contributions: W.M. proposed the main idea and wrote the draft; D.Z. derived the algorithm; Z.Z. and J.D. were in charge of technical checking; J.Q. and X.H. performed the simulations.

Acknowledgments: This work was supported by the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2017JM6033), the Scientific Research Program Funded by Shaanxi Provincial Education Department (No. 17JK0550), the State Key Laboratory of Electrical Insulation and Power Equipment (EIPE18201), and the Science and Technology Project of China Huaneng Group (HNKJ13-H20-03).

Conflicts of Interest: The authors declare no conflict of interest.
References

1. Kalouptsidis, N.; Mileounis, G.; Babadi, B. Adaptive algorithms for sparse system identification. Signal Process. 2011, 91, 1910–1919.
2. Slock, D.T.M. On the convergence behavior of the LMS and the normalized LMS algorithms. IEEE Trans. Signal Process. 1993, 41, 2811–2825.
3. Walach, E.; Widrow, B. The least mean fourth (LMF) adaptive algorithm and its family. IEEE Trans. Inf. Theory 1984, 30, 275–283.
4. Zerguine, A. Convergence and steady-state analysis of the normalized least mean fourth algorithm. Digit. Signal Process. 2007, 17, 17–31.
5. Liu, W.; Pokharel, P.; Principe, J.C. Correntropy: Properties and applications in non-Gaussian signal processing. IEEE Trans. Signal Process. 2007, 55, 5286–5298.
6. Chen, B.; Xing, L.; Liang, J.; Zheng, N.; Principe, J.C. Steady-state mean-square error analysis for adaptive filtering under the maximum correntropy criterion. IEEE Signal Process. Lett. 2014, 21, 880–884.
7. Chen, B.; Principe, J.C. Maximum correntropy estimation is a smoothed MAP estimation. IEEE Signal Process. Lett. 2012, 19, 491–494.
8. Chen, B.; Wang, J.; Zhao, H.; Zheng, N.; Principe, J.C. Convergence of a fixed-point algorithm under maximum correntropy criterion. IEEE Signal Process. Lett. 2015, 22, 1723–1727.
9. Chambers, J.; Avlonitis, A. A robust mixed-norm adaptive filter algorithm. IEEE Signal Process. Lett. 1997, 4, 46–48.
10. Breidt, F.J.; Davis, R.A.; Trindade, A.A. Least absolute deviation estimation for all-pass time series models. Ann. Stat. 2001, 29, 919–946.
11. Ma, W.; Chen, B.; Duan, J.; Zhao, H. Diffusion maximum correntropy criterion algorithms for robust distributed estimation. Digit. Signal Process. 2016, 155, 10–19.
12. Wu, Z.; Shi, J.; Zhang, X.; Chen, B. Kernel recursive maximum correntropy. Signal Process. 2015, 117, 11–16.
13. Chen, B.; Xing, L.; Zhao, H.; Zheng, N.; Principe, J.C. Generalized correntropy for robust adaptive filtering. IEEE Trans. Signal Process. 2016, 64, 3376–3387.
14. Wang, Y.; Li, Y.; Albu, F.; Yang, R. Group-constrained maximum correntropy criterion algorithms for estimating sparse mix-noised channels. Entropy 2017, 19, 432.
15. Wu, Z.; Peng, S.; Chen, B.; Zhao, H. Robust Hammerstein adaptive filtering under maximum correntropy criterion. Entropy 2015, 17, 7149–7166.
16. Li, Y.; Wang, Y.; Yang, R.; Albu, F. A soft parameter function penalized normalized maximum correntropy criterion algorithm for sparse system identification. Entropy 2017, 19, 45.
17. Donoho, D.L. Compressed sensing. IEEE Trans. Inf. Theory 2006, 52, 1289–1306.
18. Baraniuk, R.G. Compressive sensing. IEEE Signal Process. Mag. 2007, 24, 21–30.
19. Duttweiler, D.L. Proportionate normalized least-mean-squares adaptation in echo cancelers. IEEE Trans. Speech Audio Process. 2000, 8, 508–518.
20. Gu, Y.; Jin, J.; Mei, S. l0 norm constraint LMS algorithm for sparse system identification. IEEE Signal Process. Lett. 2009, 16, 774–777.
21. Gui, G.; Adachi, F. Sparse least mean fourth algorithm for adaptive channel estimation in low signal-to-noise ratio region. Int. J. Commun. Syst. 2014, 27, 3147–3157.
22. Ma, W.; Qu, H.; Gui, G.; Xu, L.; Zhao, J.; Chen, B. Maximum correntropy criterion based sparse adaptive filtering algorithms for robust channel estimation under non-Gaussian environments. J. Franklin Inst. 2015, 352, 2708–2727.
23. Li, Y.; Wang, Y.; Albu, F.A.; Jiang, J. General zero attraction proportionate normalized maximum correntropy criterion algorithm for sparse system identification. Symmetry 2017, 9, 229.
24. Li, Y.; Jin, Z.; Wang, Y.; Yang, R. A robust sparse adaptive filtering algorithm with a correntropy induced metric constraint for broadband multipath channel estimation. Entropy 2016, 18, 380.
25. Ma, W.; Chen, B.; Qu, H.; Zhao, J. Sparse least mean p-power algorithms for channel estimation in the presence of impulsive noise. Signal Image Video Process. 2016, 10, 503–510.
26. Sayin, M.O.; Yilmaz, Y.; Demir, A. The Krylov-proportionate normalized least mean fourth approach: Formulation and performance analysis. Signal Process. 2015, 109, 1–13.
27. Ma, W.; Zheng, D.; Zhang, Z. Robust proportionate adaptive filter based on maximum correntropy criterion for sparse system identification in impulsive noise environments. Signal Image Video Process. 2018, 12, 117–124.
28. Jung, S.M.; Park, P.G. Normalized least-mean-square algorithm for adaptive filtering of impulsive measurement noises and noisy inputs. Electron. Lett. 2013, 49, 1270–1272.
29. Yoo, J.W.; Shin, J.W.; Park, P.G. An improved NLMS algorithm in sparse systems against noisy input signals. IEEE Trans. Circuits Syst. II Express Briefs 2015, 62, 271–275.
30. Ma, W.; Zheng, D.; Tong, X.; Zhang, Z.; Chen, B. Proportionate NLMS with unbiasedness criterion for sparse system identification in the presence of input and output noises. IEEE Trans. Circuits Syst. II Express Briefs 2017.
31. Zheng, Z.; Liu, Z.; Zhao, H. Bias-compensated normalized least-mean fourth algorithm for noisy input. Circuits Syst. Signal Process. 2017, 36, 3864–3873.
32. Zhao, H.; Zheng, Z. Bias-compensated affine-projection-like algorithms with noisy input. Electron. Lett. 2016, 52, 712–714.
33. Ma, W.; Zheng, D.; Li, Y.; Zhang, Z.; Chen, B. Bias-compensated normalized maximum correntropy criterion algorithm for system identification with noisy input. arXiv 2017.
34. Shin, H.C.; Sayed, A.H.; Song, W.J. Variable step-size NLMS and affine projection algorithms. IEEE Signal Process. Lett. 2004, 11, 132–135.
35. Jo, S.E.; Kim, S.W. Consistent normalized least mean square filtering with noisy data matrix. IEEE Trans. Signal Process. 2005, 53, 2112–2123.