EURASIP Journal on Applied Signal Processing 2001:4, 257–265 © 2001 Hindawi Publishing Corporation
Reduced Complexity Volterra Models for Nonlinear System Identification

Rıfat Hacıoğlu
Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL 60616, USA

Geoffrey A. Williamson
Department of Electrical and Computer Engineering, Illinois Institute of Technology, Chicago, IL 60616, USA

Received 27 July 2001 and in revised form 28 September 2001

A broad class of nonlinear systems and filters can be modeled by the Volterra series representation. However, its practical use in nonlinear system identification is sometimes limited by the large number of parameters associated with the Volterra filter's structure. The parametric complexity also complicates design procedures based upon such a model. This paper addresses the limitation for system identification using a fixed pole expansion technique (FPET) within the Volterra model structure. The FPET approach employs orthonormal basis functions derived from fixed (real or complex) pole locations to expand the Volterra kernels and reduce the number of estimated parameters. A digital satellite channel example, in which the proposed method is used to identify the channel dynamics, demonstrates that FPET can considerably reduce the number of estimated parameters. Furthermore, a gradient-descent procedure that adaptively selects the pole locations in the FPET structure is developed.

Keywords and phrases: nonlinear system identification, Volterra model structure, orthonormal basis functions.
1. INTRODUCTION
The identification of nonlinear dynamical systems from a given input-output data set has attracted considerable interest since many physical systems exhibit nonlinear characteristics. The Volterra model structure [1] can be used to represent a broad class of nonlinearities [2]. The output $y(k)$ of a discrete-time, time-invariant, truncated $N$th order Volterra model with input sequence $u(k)$ is

\[
y(k) = h^{(0)} + \sum_{n=1}^{N} \sum_{l_1,\ldots,l_n=0}^{\infty} h^{(n)}_{l_1 \cdots l_n}\, u(k-l_1) \cdots u(k-l_n), \tag{1}
\]

where $\{h^{(n)}_{l_1 \cdots l_n}\}$ is called the $n$th order Volterra kernel. In practice, the infinite sum in (1) may be truncated to a finite number of terms if the system has fading memory [2] (see footnote 1). It has been shown in [2] that any time-invariant nonlinear system with fading memory can be approximated to any precision by a finite Volterra series representation. Hence, the class of truncated Volterra series models is attractive for nonlinear system identification.

Footnote 1: A nonlinear system or filter has fading memory if the dependence of the system output on past inputs fades out as time goes to the remote past. Many stable systems possess this property, though some broad classes of systems, such as those with hysteresis effects, do not. See [2] for a detailed definition of fading memory.

Volterra filters are simple to use and have useful properties. For instance, they are linear in the parameters, and hence standard and well-behaved parameter estimation techniques can be used. Also, the Volterra series representation is a natural nonlinear extension of a linear impulse response model. However, the large number of parameters associated with Volterra models limits their practical utility to problems involving only modest values of memory and/or model order. This limitation arises not only because the identification of the large number of parameters may be problematic, but also because design procedures based upon such models may be cumbersome. Therefore, it is desirable to reduce the number of free parameters in Volterra models in situations with a high memory value and/or a high model order. To address this issue, we develop a fixed pole expansion technique (FPET) [3] amenable to least mean square (LMS) based or least squares (LS) based nonlinear system identification. The FPET approach employs orthonormal basis functions derived from fixed pole locations [4] to expand the
Volterra kernels. A good choice for the pole locations is one for which the approximation error of the truncated series expansion decreases rapidly with model order. Such a choice enables a lower order, and hence reduced complexity, model for a desired level of approximation error. The idea of expanding the kernels of Volterra filters was originally described by Wiener in [5], where Laguerre (single real pole) functions were used as the basis for the kernels. Marmarelis [6] applied this idea with discrete-time Laguerre functions. In this paper, we extend the approach to include orthonormal basis functions using multiple fixed poles. Our motivation is that the choice of the basis functions can greatly enhance the ability of the model to describe the nonlinear dynamics. Furthermore, we develop a gradient-descent procedure that adaptively selects the fixed pole locations in the FPET structure to best fit a given data set. For a detailed overview of the Volterra model structure the reader is referred to the existing literature (cf. [1, 2, 5, 6, 7]). We also note that our approach of reducing model complexity through decreasing the required Volterra system order is only one way to address the complexity issue. One recent alternative is that of [8, 9]. There, the Volterra model is approximated using a parallel-cascade structure in which parallel connections of multiplicative combinations enable a reduction of the overall complexity. Such an approach may also be applied to the FPET structure in addition to the more traditional Volterra structure. The organization of this paper is as follows. In Section 2, we develop the FPET structure through a generalization of a block structure representation of Volterra filters and establish its equivalence. Section 3 defines the fixed pole basis functions for the FPET structure. Section 4 presents the identification and pole selection approach for FPET.
To show some features of the identification and pole selection approach, we consider in Section 5 an example that can be described in FPET form. In Section 6, we give a simplified version of a linear-nonlinear-linear (LNL) structured digital satellite channel example in order to show the FPET performance. Finally, we draw our conclusions in Section 7.

2. FIXED POLE EXPANSION TECHNIQUE
Let $\{g_i(\ell)\}$ be a sequence of impulse response functions that form a basis for the linear space of stable, time-invariant systems. Then the Volterra model class of (1) is equivalent to the set of nonlinear systems described by

\[
y(k) = b^{(0)} + \sum_{n=1}^{N} \sum_{l_1,\ldots,l_n=0}^{\infty} b^{(n)}_{l_1 \cdots l_n}\, x_{l_1}(k) \cdots x_{l_n}(k), \tag{2}
\]

where

\[
x_i(k) = \sum_{\ell=0}^{\infty} g_{i+1}(\ell)\, u(k-\ell) \tag{3}
\]

is the output of the linear system with impulse response $\{g_{i+1}(\ell)\}$ (equivalently, transfer function $G_{i+1}(z)$) driven by the input signal $u(k)$. If we truncate this expansion to include only a finite number of terms, we obtain

\[
y(k) = b^{(0)} + \sum_{n=1}^{N} \sum_{l_1,\ldots,l_n=0}^{M-1} b^{(n)}_{l_1 \cdots l_n}\, x_{l_1}(k) \cdots x_{l_n}(k). \tag{4}
\]

[Figure 1: Block structure of Volterra models. The input $u(k)$ drives the linear filters $G_1(z), G_2(z), \ldots, G_M(z)$, whose outputs $x_0(k), x_1(k), \ldots, x_{M-1}(k)$ feed an $M$-input memoryless nonlinear system followed by a weighting and summation stage that produces $y(k)$.]

The model class of truncated systems can be represented as shown in Figure 1. There we have a single-input, multi-output linear filter bank whose outputs are used to generate monomial terms, which are then weighted and summed. The unknown weight parameters $\{b^{(n)}_{l_1 \cdots l_n}\}$ in (4) can be estimated in practice by linear regression of the output data $y(k)$ on the terms of the multinomial expansion of (4), as long as they are finite in number. One can see in Figure 1 that if $G_i(z) = z^{-(i-1)}$, the Volterra series representation is recovered. Note that the total number of weight parameters for the $N$th order expansion of (4) (or the Volterra model structure in (1)) is $(M^{N+1} - 1)/(M - 1)$. Without loss of generality, one can assume that the kernel parameters are symmetric, that is, $\{b^{(n)}_{l_1 \cdots l_n}\}$ in (4) (or $\{h^{(n)}_{l_1 \cdots l_n}\}$ in (1)) is left unchanged by any of the $n!$ permutations of the indices $l_1, \ldots, l_n$. In this case the total number of parameters reduces to $(M + N)!/(M!\,N!)$. In this paper we represent the input-output relation as in (4) (or in (1)), in keeping with the standard practice in the literature, but when we remark on the total number of required parameters, we use the smaller number from the second formula.

The equivalence of (2) to (1) follows from $\{g_i(\ell)\}$ forming a basis and from the maximum order of the polynomial terms equalling $N$ in both cases. For instance, substituting (3) into (4) gives

\[
y(k) = b^{(0)} + \sum_{n=1}^{N} \sum_{k_1,\ldots,k_n=0}^{\infty} \left[ \sum_{l_1,\ldots,l_n=0}^{M-1} b^{(n)}_{l_1 \cdots l_n}\, g_{l_1+1}(k_1) \cdots g_{l_n+1}(k_n) \right] u(k-k_1) \cdots u(k-k_n), \tag{5}
\]

showing that the relationship between the Volterra kernels $\{h^{(n)}_{k_1 \cdots k_n}\}$ in (1) and the weight parameters $\{b^{(n)}_{l_1 \cdots l_n}\}$ in (4) can be written as

\[
h^{(n)}_{k_1 \cdots k_n} = \sum_{l_1,\ldots,l_n=0}^{M-1} b^{(n)}_{l_1 \cdots l_n}\, g_{l_1+1}(k_1) \cdots g_{l_n+1}(k_n). \tag{6}
\]
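The kernel relation (6) can be evaluated directly once the weights and the basis impulse responses are available. The following is a minimal sketch in plain Python, not the authors' implementation; the function name `volterra_kernel`, the dictionary layout of the weights, and the toy numbers are assumptions made only for illustration.

```python
from itertools import product


def volterra_kernel(b, g, lags):
    """Evaluate the kernel value h^{(n)}_{k_1...k_n} of (6).

    b    : dict mapping index tuples (l_1, ..., l_n) to weights b^{(n)}
    g    : list of M impulse responses; g[i][k] holds g_{i+1}(k)
    lags : tuple (k_1, ..., k_n) at which the kernel is evaluated
    """
    n = len(lags)
    total = 0.0
    # Sum over all index tuples l_1, ..., l_n in 0..M-1.
    for idx in product(range(len(g)), repeat=n):
        term = b.get(idx, 0.0)
        for l, k in zip(idx, lags):
            term *= g[l][k]
        total += term
    return total


# Hypothetical toy data: M = 2 pure delays, G_i(z) = z^{-(i-1)}.
g_delay = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
b2 = {(0, 0): 0.5, (0, 1): -0.2, (1, 0): -0.2, (1, 1): 0.3}
h2_01 = volterra_kernel(b2, g_delay, (0, 1))
h2_11 = volterra_kernel(b2, g_delay, (1, 1))
```

With pure delays as the basis, each kernel value equals the corresponding weight, which matches the remark that $G_i(z) = z^{-(i-1)}$ recovers the Volterra series representation.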
A sequence of such finite term expressions of the form of (4) converging to the infinite term expression in (2) thus yields the corresponding convergent sequence of expressions of the form of (5) and hence (1). Similarly, one may construct expressions to obtain the kernels in (2) from the kernels of the equivalent Volterra model. Because of the equivalence of the Volterra model class and the proposed model class of (2), and following the results of [2], any time-invariant nonlinear system with fading memory can be approximated to within arbitrary accuracy by a model of the form (4).

3. FIXED POLE BASIS FUNCTIONS

The previous section shows that we may approximate the input-output behavior of any fading memory nonlinear system by a linear combination of $M$ fixed linear systems $G_1(z), \ldots, G_M(z)$ and nonlinear interactions of their outputs. The adjustable parameters $\{b^{(n)}_{l_1 \cdots l_n}\}$ are chosen to find the best linear combination of orthonormal basis functions $G_1(z), \ldots, G_M(z)$. Our goal is to choose $G_1(z), \ldots, G_M(z)$ so that a good model may be obtained with small $M$. To this end we consider the set of orthonormal basis functions

\[
G_k(z) = L_k(z) \prod_{i=1}^{k-1} H_i(z), \qquad k = 1, 2, \ldots, M, \tag{7}
\]

by specified choice of $L_k(z)$ and $H_i(z)$, $i = 1, \ldots, k-1$. Orthonormality implies

\[
\frac{1}{2\pi j} \oint G_m(z)\, G_n(z^{-1})\, \frac{dz}{z} =
\begin{cases} 1, & m = n, \\ 0, & m \neq n, \end{cases} \tag{8}
\]

where the contour integral is taken around the unit circle and the functions $G_m(z)$ are analytic in the exterior of the circle. (Equivalently, the impulse response functions $\{g_m(\ell)\}$ associated with $\{G_m(z)\}$ are orthogonal in the $\ell_2$ sense.) Here the filter $L_k(z)$ and the all-pass filter $H_i(z)$ are first (second) order filters having fixed real (complex-conjugate) poles. The choice of fixed poles for $L_k(z)$ and $H_i(z)$ motivates the name fixed pole expansion technique, or FPET. We specify the orthonormal basis functions in (7) using the approach taken in [10, 11] for either real poles or complex-conjugate pole pairs.

Definition 3.1. Fixed real pole basis functions can be defined as

\[
G_k(z) = \sqrt{1 - \alpha_k^2}\; \frac{z}{z - \alpha_k} \prod_{i=1}^{k-1} \frac{1 - \alpha_i z}{z - \alpha_i}, \tag{9}
\]

where $k = 1, 2, \ldots, M$, if the poles $\alpha_k$ are real numbers in the unit circle. Here $L_k(z) = \sqrt{1 - \alpha_k^2}\, z/(z - \alpha_k)$ and $H_i(z) = (1 - \alpha_i z)/(z - \alpha_i)$ in (7).

Definition 3.2. The sequence of fixed complex pole basis functions $\{G_k(z)\}$ in (7) forms an orthonormal set by the choice of

\[
L_{2k-1}(z) = \sqrt{\bigl(1 - (\beta_k \beta_k^*)^2\bigr)\bigl(1 - \delta_k^2\bigr)}\; \frac{z}{(z - \beta_k)(z - \beta_k^*)}, \qquad
L_{2k}(z) = \sqrt{1 - (\beta_k \beta_k^*)^2}\; \frac{z\,(z - \delta_k)}{(z - \beta_k)(z - \beta_k^*)}, \qquad k = 1, 2, \ldots \tag{10}
\]

with $\delta_k = (\beta_k + \beta_k^*)/(1 + \beta_k \beta_k^*)$ and

\[
H_{2i-1}(z) = H_{2i}(z) = \frac{(1 - \beta_i z)(1 - \beta_i^* z)}{(z - \beta_i)(z - \beta_i^*)}, \tag{11}
\]

where the complex conjugate pole pairs $(\beta_k, \beta_k^*)$ are in the unit circle. ($H_1(z) = 1$ in (11).)

[Figure 2: Realization of fixed pole filter bank. The input $u(k)$ passes through the cascade of all-pass sections $H_1(q^{-1}), \ldots, H_{M-1}(q^{-1})$; the sections $L_1(q^{-1}), L_2(q^{-1}), \ldots, L_M(q^{-1})$ tap this cascade to produce $x_0(k), x_1(k), \ldots, x_{M-1}(k)$.]

Figure 2 diagrams the realization of the fixed real and/or complex pole filter banks defined above. Note that if $\alpha_k = \alpha$ for all $k$ (only one real pole location) in (9), then the $\{G_j(z)\}$ are called Laguerre functions in [6, 10, 11]. Likewise, if $\beta_k = \beta$ and $\beta_k^* = \beta^*$ for all $k$ (one set of complex-conjugate pole locations) in (10) and (11), then the $\{G_j(z)\}$ are called Kautz functions in [11]. Note also that if $\alpha_k = 0$, then (4) becomes the truncated Volterra filter with a finite memory length. One of the most important aspects of the Volterra representation of nonlinear systems is that the kernel parameters are linearly related to the output. Therefore, given a finite set of input and output measurements, estimates of the Volterra kernel parameters can be obtained using least squares (LS) for preselected fixed poles.
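As an illustration of the real-pole basis (9) and the cascade realization of Figure 2, the sketch below builds the filter bank outputs in plain Python and checks orthonormality of the impulse responses numerically. It is a minimal sketch under stated assumptions: the helper names (`one_pole`, `all_pass`, `real_pole_filter_bank`, `gram`) are hypothetical, zero initial conditions are assumed, and the impulse responses are truncated at 400 samples.

```python
import math


def one_pole(sig, a):
    """Apply 1/(1 - a q^-1): y(k) = sig(k) + a*y(k-1)."""
    out, y_prev = [], 0.0
    for s in sig:
        y_prev = s + a * y_prev
        out.append(y_prev)
    return out


def all_pass(sig, a):
    """Apply (q^-1 - a)/(1 - a q^-1), i.e. H_i(z) = (1 - a z)/(z - a)."""
    out, s_prev, y_prev = [], 0.0, 0.0
    for s in sig:
        y_prev = -a * s + s_prev + a * y_prev
        s_prev = s
        out.append(y_prev)
    return out


def real_pole_filter_bank(u, poles):
    """Return [x_0, ..., x_{M-1}] for the real-pole basis functions of (9)."""
    outputs, v = [], list(u)
    for a in poles:
        gain = math.sqrt(1.0 - a * a)
        outputs.append([gain * s for s in one_pole(v, a)])  # tap L_k
        v = all_pass(v, a)  # cascade the all-pass section for the next tap
    return outputs


def gram(responses):
    """Matrix of inner products between the given sequences."""
    return [[sum(p * q for p, q in zip(r1, r2)) for r2 in responses]
            for r1 in responses]


impulse = [1.0] + [0.0] * 399
basis = real_pole_filter_bank(impulse, [0.5, 0.65, 0.5, 0.65])
G = gram(basis)  # close to the 4 x 4 identity matrix
```

Setting all poles to zero in `real_pole_filter_bank` turns the outputs into delayed copies of the input, which corresponds to the truncated Volterra filter mentioned above.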
4. IDENTIFICATION AND POLE SELECTION ALGORITHM
In this section, we consider the identification of the weight parameters in (4). Additionally, we address the problem of
specifying the location of the fixed poles of the expansion. As noted earlier, the intent of the FPET approach is to enable reduction of the overall model order through a "good" selection of the fixed poles. For situations in which a selection of fixed pole locations is not possible via a priori information, we develop a gradient descent based scheme for selecting the pole locations for the $M$ basis functions. Given a set of measured input and output data $\{u(k), y(k),\ k = 0, \ldots, K-1\}$, we consider the following squared error cost function

\[
J(k) = \frac{1}{2} \bigl( y(k) - \hat{y}(k) \bigr)^2, \qquad k = 0, \ldots, K-1, \tag{12}
\]
where $\hat{y}(k)$ is an estimate of the measured output $y(k)$. The estimates of the weight parameters $\{\hat{b}^{(n)}_{l_1 \cdots l_n}\}$ and pole parameters $\{\hat{\lambda}_i\}$ may be defined as the minimizing argument of either the quadratic error criterion $E\{J(k)\}$ (an LMS criterion, for which we assume a stationary random environment) or $(1/K)\sum_{k=0}^{K-1} J(k)$ (an LS criterion). If the $i$th pole is real-valued, then its estimate $\hat{\lambda}_i$ is $\hat{\alpha}_i$. Instead of working in the complex field, we express the estimate of a complex pole parameter $\hat{\lambda}_i$ as a real parameter vector containing the real and imaginary parts of the complex pole estimate $\hat{\beta}_i$.

Here we develop an iterative estimation algorithm using the LMS formalism, taking the standard approach of following an approximate gradient descent [12] using the gradient of $J(k)$ in (12). The instantaneous gradient of $J(k)$ with respect to the weight parameter $\{\hat{b}^{(n)}_{l_1 \cdots l_n}\}$ is

\[
\frac{\partial J(k)}{\partial \hat{b}^{(n)}_{l_1 \cdots l_n}(k-1)} = -\hat{x}_{l_1}(k) \cdots \hat{x}_{l_n}(k) \bigl( y(k) - \hat{y}(k) \bigr). \tag{13}
\]

The instantaneous gradient of $J(k)$ with respect to the $i$th pole is

\[
\frac{\partial J(k)}{\partial \hat{\lambda}_i(k-1)} = -\frac{\partial \hat{y}(k)}{\partial \hat{\lambda}_i(k-1)} \bigl( y(k) - \hat{y}(k) \bigr), \tag{14}
\]

where

\[
\frac{\partial \hat{y}(k)}{\partial \hat{\lambda}_i(k-1)} =
\left[ \frac{\partial \hat{x}_0(k)}{\partial \hat{\lambda}_i(k-1)}, \ldots, \frac{\partial \hat{x}_{M-1}(k)}{\partial \hat{\lambda}_i(k-1)} \right] B_{\hat{\theta}}^{T}\, \hat{\varphi}_{N-1}(k) \tag{15}
\]

and where $B_{\hat{\theta}}$ is an $M \times ((M^N - 1)/(M - 1))$ matrix whose elements are the estimates of the parameters $\{b^{(n)}_{l_1 \cdots l_n}\}$ in (4). $\hat{\varphi}_{N-1}(k)$ is a $((M^N - 1)/(M - 1)) \times 1$ vector whose elements are the filter bank outputs $x_i(k)$ and their nonlinear interactions (e.g., $x_i(k) x_j(k)$) up to order $N-1$. It is straightforward to find the partial derivatives $\partial \hat{x}_j(k)/\partial \hat{\lambda}_i(k-1)$ of the linear filter bank outputs with respect to the fixed pole estimates from (9), (10), and (11) using standard techniques, for example, [13]. This will be illustrated below.

A simple recursive procedure based on this gradient is the normalized LMS [12]

\[
\begin{aligned}
\hat{b}^{(n)}_{l_1 \cdots l_n}(k) &= \hat{b}^{(n)}_{l_1 \cdots l_n}(k-1) + \mu_b\, \frac{\hat{x}_{l_1}(k) \cdots \hat{x}_{l_n}(k)}{\epsilon + \bigl\| \hat{x}_{l_1}(k) \cdots \hat{x}_{l_n}(k) \bigr\|^2} \bigl( y(k) - \hat{y}(k) \bigr), \\
\hat{\lambda}_i(k) &= \hat{\lambda}_i(k-1) + \mu_\lambda\, \frac{\partial \hat{y}(k)}{\partial \hat{\lambda}_i(k-1)} \bigl( y(k) - \hat{y}(k) \bigr),
\end{aligned} \tag{16}
\]

where $\mu_b$ and $\mu_\lambda$ are positive adaptation stepsizes for the weight parameters and pole parameters, respectively, and $\epsilon$ is a small, positive constant. The combination of a normalized algorithm for the weight parameters and an unnormalized algorithm for the pole parameters aids convergence. The estimate $\hat{\lambda}_i(k)$ is subjected to $|\hat{\alpha}_i(k)| < 1$, $|\hat{\beta}_i(k)| < 1$ to ensure stability. Other algorithms, such as the Gauss-Newton algorithm, can also be used to find the pole locations as well as the weight parameters.

To illustrate the development of (15), we give the estimated output of a second order FPET model with two real poles, so that

\[
\hat{y}(k) = \hat{b}^{(0)} + \hat{b}^{(1)}_0 \hat{x}_0(k) + \hat{b}^{(1)}_1 \hat{x}_1(k) + \hat{b}^{(2)}_{00} \hat{x}_0^2(k) + 2 \hat{b}^{(2)}_{01} \hat{x}_0(k) \hat{x}_1(k) + \hat{b}^{(2)}_{11} \hat{x}_1^2(k), \tag{17}
\]

where the estimated filter bank outputs are

\[
\begin{aligned}
\hat{x}_0(k) &= \sqrt{1 - \hat{\alpha}_1^2}\; \frac{1}{1 - \hat{\alpha}_1 q^{-1}}\, u(k), \\
\hat{x}_1(k) &= \sqrt{1 - \hat{\alpha}_2^2}\; \frac{-\hat{\alpha}_1 + q^{-1}}{(1 - \hat{\alpha}_1 q^{-1})(1 - \hat{\alpha}_2 q^{-1})}\, u(k).
\end{aligned} \tag{18}
\]

If we take the derivative with respect to the estimated fixed poles $\hat{\alpha}_1$ and $\hat{\alpha}_2$, we get in matrix form

\[
\begin{bmatrix} \dfrac{\partial \hat{y}(k)}{\partial \hat{\alpha}_1} \\[2ex] \dfrac{\partial \hat{y}(k)}{\partial \hat{\alpha}_2} \end{bmatrix}
=
\begin{bmatrix} \dfrac{\partial \hat{x}_0(k)}{\partial \hat{\alpha}_1} & \dfrac{\partial \hat{x}_1(k)}{\partial \hat{\alpha}_1} \\[2ex] \dfrac{\partial \hat{x}_0(k)}{\partial \hat{\alpha}_2} & \dfrac{\partial \hat{x}_1(k)}{\partial \hat{\alpha}_2} \end{bmatrix}
\begin{bmatrix} \hat{b}^{(1)}_0 & \hat{b}^{(1)}_1 \\ 2\hat{b}^{(2)}_{00} & 2\hat{b}^{(2)}_{01} \\ 2\hat{b}^{(2)}_{01} & 2\hat{b}^{(2)}_{11} \end{bmatrix}^{T}
\begin{bmatrix} 1 \\ \hat{x}_0(k) \\ \hat{x}_1(k) \end{bmatrix}. \tag{19}
\]

The partial derivatives of $\hat{x}_0(k)$ and $\hat{x}_1(k)$ can be approximated as (see [13])

\[
\begin{aligned}
\frac{\partial \hat{x}_0(k)}{\partial \hat{\alpha}_1} &= \frac{1}{1 - \hat{\alpha}_1^2}\; \frac{-\hat{\alpha}_1 + q^{-1}}{1 - \hat{\alpha}_1 q^{-1}}\, \hat{x}_0(k), \\
\frac{\partial \hat{x}_0(k)}{\partial \hat{\alpha}_2} &= 0, \\
\frac{\partial \hat{x}_1(k)}{\partial \hat{\alpha}_1} &= \frac{q^{-1}}{1 - \hat{\alpha}_1 q^{-1}}\, \hat{x}_1(k) + \frac{-\sqrt{1 - \hat{\alpha}_2^2}\,/\sqrt{1 - \hat{\alpha}_1^2}}{1 - \hat{\alpha}_2 q^{-1}}\, \hat{x}_0(k), \\
\frac{\partial \hat{x}_1(k)}{\partial \hat{\alpha}_2} &= \frac{1}{1 - \hat{\alpha}_2^2}\; \frac{-\hat{\alpha}_2 + q^{-1}}{1 - \hat{\alpha}_2 q^{-1}}\, \hat{x}_1(k).
\end{aligned} \tag{20}
\]

[Figure 3: Realization of sensitivity function for two real poles. Block diagram that generates $\hat{x}_0(k)$, $\hat{x}_1(k)$, and the partial derivatives of (20) from $u(k)$ by linear filtering.]

[Figure 4 (block diagram; its caption appears in Section 5): the input $u(k)$ drives $G_1(z), G_2(z), G_3(z), G_4(z)$, whose outputs $x_0(k), x_1(k), x_2(k), x_3(k)$ are weighted by 1.02, −0.12, 0.15, and −0.07, respectively, and summed to form $x(k)$; the output is $y(k) = 0.1 + x + 0.5x^2 + x^3$.]
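The weight update of the normalized LMS recursion in (16) can be sketched as follows for fixed poles. This is only an illustrative sketch: it assumes the standard normalized LMS form of [12], in which the step is divided by $\epsilon$ plus the squared Euclidean norm of the whole regressor vector, and the function name `nlms_step`, the toy weights, and the stand-in regressor are hypothetical.

```python
import random


def nlms_step(weights, regressor, y, mu_b=0.2, eps=1e-6):
    """One normalized LMS weight update; returns (new_weights, a priori error).

    Assumption: normalization by eps + squared norm of the full regressor.
    """
    y_hat = sum(w * r for w, r in zip(weights, regressor))
    err = y - y_hat
    norm2 = sum(r * r for r in regressor)
    scale = mu_b * err / (eps + norm2)
    return [w + scale * r for w, r in zip(weights, regressor)], err


random.seed(0)
true_w = [0.1, 1.0, 0.5]   # hypothetical weights for the terms 1, x, x^2
w = [0.0, 0.0, 0.0]
for _ in range(5000):
    x = random.gauss(0.0, 1.0)   # stands in for a filter bank output x_0(k)
    phi = [1.0, x, x * x]        # monomial terms that enter linearly
    y = sum(t * p for t, p in zip(true_w, phi))   # noise-free target
    w, _ = nlms_step(w, phi, y)
```

Because the model is linear in the weights, the same step applies unchanged to the full set of monomial terms in (4) once the filter bank outputs are available.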
Each partial derivative is generated via linear filtering of the filter bank outputs using the estimated pole locations. This is shown in Figure 3. The approximate gradient descent procedure described above will converge to a local minimum of the cost function error surface as long as the true gradient is well approximated. There are thus two major issues regarding the asymptotic convergence behavior of the algorithm: (a) how well the true gradient is approximated, and (b) the nature and number of local minima of the cost function. Concerning (a), we will not conduct a detailed analysis here, but instead make the following observations. At any time $k$, the constraints on the pole locations keep the equivalent "frozen" system stable. With a slowly time-varying system (slow adaptation), the internal signals remain bounded. Convergence of the approximate gradient descent then depends on an adequately small adaptation gain (step size). This is indeed what we observe for appropriately chosen step sizes. One should also note that the algorithm with fixed pole locations (no adaptation of $\hat{\lambda}_i$) is a simple linear regression and is well behaved. The nature of the error surface is more problematic. As is typical for filter structures in which poles are adapted, multimodal error surfaces are possible [14]. In this nonlinear system environment, the likelihood of multimodal error surfaces apparently increases. It is also possible to have multimodal error surfaces even in the "sufficient order" case, in which the model structure is able to represent the unknown system exactly with zero error (the example in the next section demonstrates this). Nonetheless, with judicious initialization of the algorithm, convergence to the global minimum, or to a significantly deep local minimum, often occurs. Some features of the error surface geometry are illustrated in the following section.
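The sensitivity expressions in (20) can be checked numerically for the two-real-pole case of (18). The sketch below, written with hypothetical helper names and zero initial conditions, generates each derivative by filtering the filter bank outputs and compares it with a central finite difference. It holds the poles fixed over the whole record, in which case the filtered expressions should agree with the finite differences up to numerical error; in the adaptive algorithm the poles vary slowly and the expressions are approximations.

```python
import math
import random


def one_pole(sig, a):
    """Apply 1/(1 - a q^-1): y(k) = sig(k) + a*y(k-1)."""
    out, y_prev = [], 0.0
    for s in sig:
        y_prev = s + a * y_prev
        out.append(y_prev)
    return out


def all_pass(sig, a):
    """Apply (-a + q^-1)/(1 - a q^-1)."""
    out, s_prev, y_prev = [], 0.0, 0.0
    for s in sig:
        y_prev = -a * s + s_prev + a * y_prev
        s_prev = s
        out.append(y_prev)
    return out


def delay(sig):
    """Apply q^-1 (one-sample delay, zero initial condition)."""
    return [0.0] + list(sig[:-1])


def bank(u, a1, a2):
    """Filter bank outputs x0(k), x1(k) of (18)."""
    x0 = [math.sqrt(1 - a1 * a1) * s for s in one_pole(u, a1)]
    x1 = [math.sqrt(1 - a2 * a2) * s for s in one_pole(all_pass(u, a1), a2)]
    return x0, x1


def sensitivities(u, a1, a2):
    """Nonzero partial derivatives of (20), obtained by filtering x0 and x1."""
    x0, x1 = bank(u, a1, a2)
    dx0_da1 = [s / (1 - a1 * a1) for s in all_pass(x0, a1)]
    ratio = -math.sqrt(1 - a2 * a2) / math.sqrt(1 - a1 * a1)
    dx1_da1 = [p + ratio * q
               for p, q in zip(one_pole(delay(x1), a1), one_pole(x0, a2))]
    dx1_da2 = [s / (1 - a2 * a2) for s in all_pass(x1, a2)]
    return dx0_da1, dx1_da1, dx1_da2


def max_gap(seq_a, seq_b):
    """Largest absolute difference between two sequences."""
    return max(abs(p - q) for p, q in zip(seq_a, seq_b))


random.seed(1)
u = [random.gauss(0.0, 1.0) for _ in range(200)]
a1, a2, h = 0.5, 0.65, 1e-6
dx0_da1, dx1_da1, dx1_da2 = sensitivities(u, a1, a2)

# Central finite differences with the poles held fixed over the whole record.
x0_p, x1_p = bank(u, a1 + h, a2)
x0_m, x1_m = bank(u, a1 - h, a2)
_, x1_p2 = bank(u, a1, a2 + h)
_, x1_m2 = bank(u, a1, a2 - h)
gaps = [
    max_gap(dx0_da1, [(p - m) / (2 * h) for p, m in zip(x0_p, x0_m)]),
    max_gap(dx1_da1, [(p - m) / (2 * h) for p, m in zip(x1_p, x1_m)]),
    max_gap(dx1_da2, [(p - m) / (2 * h) for p, m in zip(x1_p2, x1_m2)]),
]
```

The list `gaps` holds the largest mismatch for each of the three nonzero derivatives; the derivative of $\hat{x}_0(k)$ with respect to $\hat{\alpha}_2$ is zero and needs no filter.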
5. SIMPLE ILLUSTRATIVE EXAMPLE
[Figure 4: Unknown system for example of Section 5.]

In this section, we consider a simple case in which we use an adaptive FPET to identify a nonlinear system that can be described in FPET form. We use the example to illustrate the performance gains achievable by the FPET approach, and also to illustrate some features of a characteristic error surface. The nonlinear system to be identified has the Wiener block structure (a linear filter followed by a memoryless nonlinearity) shown in Figure 4. The basis functions $G_i(z)$, $i = 1, \ldots, 4$, are given by (9) with $\alpha_1 = \alpha_3 = 0.5$ and $\alpha_2 = \alpha_4 = 0.65$. Note that this system may be exactly described by an FPET of the form of (4) with $N = 3$ and $M = 4$, using only two fixed poles at $z = 0.5$ and $z = 0.65$. An independent, identically distributed (i.i.d.), zero mean, white Gaussian input signal of 15,000 samples with unit variance is used to generate the output data set in this example. Adaptation stepsizes $\mu_b = 0.2$ and $\mu_\lambda = 0.005$ are used for the adaptive algorithm in (16). The performance measure, the normalized mean square error (NMSE), is defined as

\[
\mathrm{NMSE} = 10 \log_{10} \frac{(1/K) \sum_{k=0}^{K-1} \bigl( y(k) - \hat{y}(k) \bigr)^2}{(1/K) \sum_{k=0}^{K-1} y(k)^2}. \tag{21}
\]
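The NMSE of (21) is straightforward to compute from a measured output record and its estimate. The sketch below is a minimal plain-Python version; the function name `nmse_db` and the sample values are assumptions for illustration only.

```python
import math


def nmse_db(y, y_hat):
    """Normalized mean square error of (21), in dB."""
    k = len(y)
    err_power = sum((a - b) ** 2 for a, b in zip(y, y_hat)) / k
    out_power = sum(a * a for a in y) / k
    return 10.0 * math.log10(err_power / out_power)


y = [1.0, -2.0, 0.5, 3.0]
y_hat = [0.9 * v for v in y]   # estimate with a 10% amplitude error
value = nmse_db(y, y_hat)      # error-to-output power ratio is 0.01
```

A 10% amplitude error gives a power ratio of 0.01, so `value` is close to −20 dB; more negative values indicate a better fit.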
The model structure to be used for the identification also has $N = 3$ and $M = 4$, but we restrict consideration to two real poles $\alpha_1$ and $\alpha_2$ to define the basis functions. Each pole is thus repeated twice. Note that because of the repeated nature of the poles in the unknown system, this model structure is capable of describing that system exactly. Figure 5 shows the error surface and contour plot, in terms of NMSE, for these two real, fixed poles. Note that the solid line in this figure (where $\alpha_1 = \alpha_2$) is the error surface for a single real pole (Laguerre). The global minimum for these two fixed real poles is at the optimum $\alpha_1 = 0.5$, $\alpha_2 = 0.65$, where the cost is zero. If we use Laguerre basis functions, then the global minimum is at $\alpha = 0.58$, as shown in Figure 5. Even though exact modeling is possible, we observe a multimodal error surface. Using the iterative algorithm described above, we find the results shown in Table 1. The table shows, for several different model structures, the initial pole parameters at the start of adaptation, the pole parameters at convergence, and the achievable NMSE at convergence. The single real pole (Laguerre) converges towards an unavoidable local minimum when the initial condition is 0. If two optimal fixed poles are chosen for the FPET, then we will have a reduction of
[Figure 5: Error surface for two real poles. NMSE (in dB) plotted against $\alpha_1$ and $\alpha_2$.]

[Figure 6: Application of Volterra filter to identify a nonlinear satellite channel. The input $u(k)$ passes through $H_B(q)$, a memoryless nonlinearity with input $s_i(k)$ and output $s_o(k)$ (characteristic shown as an inset plot of $s_o$ versus $s_i$), and $H_C(q)$; the noise $n(k)$ is added to form $y(k)$. A 3rd order Volterra filter driven by $u(k)$ produces $\hat{y}(k)$, and the error $y(k) - \hat{y}(k)$ is formed.]
Table 1: NMSE values for example of Section 5.

Model class selection      Fixed pole(s), initial   Fixed pole(s), converged   NMSE (in dB)
Volterra (M = 4)           0                        0                          −14.48
Laguerre (M = 4)           0                        0.372                      −33.49
                           0.7                      0.582                      −49.14
Fixed real pole (M = 4)    0.5, 0.55                0.504, 0.658               −66.47
                           0.7, 0.65                0.675, 0.492               −58.96
Volterra (M = 20)          0                        0                          −52.15
approximately 17 dB in the NMSE criterion compared to Laguerre and a reduction of about 51 dB compared to the truncated Volterra model with the same parameter complexity. Even with an initialization that leads to convergence to a local minimum (the second fixed real pole initialization in Table 1), the resulting NMSE decrease is significant compared to the Volterra model. The Volterra kernel parameters become negligible for this example when $M$ reaches about 20, and the achievable reduction in NMSE reaches 52 dB, roughly comparable to the NMSE for the suboptimal, two pole, $M = 4$ FPET case. It is illustrative to compare the parametric complexity of the $M = 20$ Volterra model with the $M = 4$ two pole FPET. In the case of the $M = 20$ Volterra model, we have 1,771 parameters to identify (excluding redundant parameters as noted in Section 2), while there are only 35 parameters in the FPET case. Hence the reduction in parametric complexity is significant when we use the FPET approach in this example.
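The parameter counts quoted above follow from the two formulas of Section 2 and can be reproduced with a few lines of Python. This is a small sketch with hypothetical function names; it only evaluates $(M^{N+1} - 1)/(M - 1)$ and $(M + N)!/(M!\,N!)$.

```python
from math import comb


def full_parameter_count(M, N):
    """Number of weights in (4) without symmetry: sum of M^n for n = 0..N."""
    return sum(M ** n for n in range(N + 1))


def symmetric_parameter_count(M, N):
    """Number of weights with symmetric kernels: (M + N)! / (M! N!)."""
    return comb(M + N, N)


fpet_count = symmetric_parameter_count(4, 3)       # M = 4 FPET, N = 3
volterra_count = symmetric_parameter_count(20, 3)  # M = 20 Volterra, N = 3
```

For $N = 3$ these evaluate to 35 parameters for $M = 4$ and 1,771 parameters for $M = 20$, in agreement with the figures stated in the text.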
6. EXAMPLE: NONLINEAR DIGITAL SATELLITE CHANNEL
To illustrate the basic algorithm behavior and the performance in a nonideal case, we consider the problem of identifying a simplified model of a digital nonlinear satellite channel [15] (see Figure 6) that can be represented by a Volterra model structure [16]. Specifically, the channel filters $u(k)$ with a low-pass linear filter, denoted $H_B(z)$, then passes the signal through a memoryless nonlinear device, and in the last stage passes the signal through another low-pass filter $H_C(z)$. In this example $H_B(z)$ is a Butterworth filter and $H_C(z)$ a Chebyshev filter, each of fourth order, with cutoff frequencies both 0.1 cycles/sample. The memoryless nonlinear device has the input-output characteristic shown in Figure 6. A detailed definition of $H_B(z)$ and $H_C(z)$ is given in [15], and the input-output relation of the memoryless device has been obtained from [16, 17] by interpolating with a third order polynomial. The input signal $u(k)$ is i.i.d., uniformly distributed on the interval [0.12, 1.78]. For such an input, the nonlinear device operates in the nonlinear region most of the time. White Gaussian noise $n(k)$ has been added to the output to give an SNR of 50 dB. Using 15,000 samples of input/output data, we fit FPET models to this system in the following cases (the polynomial order is $N = 3$ in all cases):

• FPET with M = 8
  – all poles at z = 0 (Volterra filter)
  – all poles at z = 0.5 (Laguerre filter)
  – all poles at z = 0.7 ± j0.3 (Kautz filter)
  – all poles at z = 0.7 ± j0.4 and z = 0.7 ± j0.1 (FPET, two complex pole pairs)
• Volterra filter (all poles at z = 0) with M = 20.

The pole locations noted above were chosen to minimize the NMSE of the identification over the choice of poles within each model structure, searching over a grid in the $z$-plane with 0.1 spacing. Table 2 shows the achieved NMSE values. Notice that a good choice of poles in the FPET has a significant advantage in terms of NMSE over the truncated third order Volterra model with the same parameter complexity, $M = 8$. For the Volterra model to achieve comparable performance, the number of poles must be increased to $M = 20$. For $M = 8$ the number of identified parameters is 165, while for $M = 20$ there are 1,771 identified parameters, so the savings in parametric complexity are quite significant. The values for the best pole positions in Table 2 were found by trial and error. The adaptive algorithm in Section 4 can be used to determine a good value on-line. We apply the
Table 2: NMSE for different models of nonlinear satellite channel.

Model class selection          Fixed pole(s)              NMSE (in dB)
Volterra (M = 8)               0                          −30.31
Laguerre (M = 8)               0.5                        −40.35
Kautz (M = 8)                  0.7 ± 0.3j                 −43.66
Fixed complex poles (M = 8)    0.7 ± 0.4j, 0.7 ± 0.1j     −45.01
Volterra (M = 20)              0                          −44.20

[Figure 7: Error surface for a complex pole location. NMSE (in dB) plotted against Re{β} and Im{β}.]

Table 3: NMSE for different models of nonlinear satellite channel when the adaptive algorithm is used to find the pole locations.

Model class selection    Fixed pole(s), initial   Fixed pole(s), converged   NMSE (in dB)
Laguerre (M = 8)         0                        0.485                      −40.38
                         0.8                      0.614                      −40.50
Kautz (M = 8)            0.52 ± 0.01j             0.65 ± 0.29j               −44.78
algorithm in the case of the Laguerre structure (all poles at one value $z = \alpha$) as well as in the case of the Kautz structure (all poles at one complex value $z = \beta$ and its conjugate), with the pole location adjusted via the algorithm. Adaptation stepsizes $\mu_b = 0.1$ and $\mu_\lambda = 0.04$ are used in this example. Table 3 shows the achieved NMSE values when we use the adaptive algorithm to find the pole locations. Even though one initialization of the Laguerre pole parameter leads to convergence to a local minimum, as shown in Table 3, the resulting NMSEs are smaller than that of the Laguerre model in Table 2. Notice that the Kautz parameter converges to a complex pole with a 1 dB reduction compared to the NMSE achieved with the Kautz model in Table 2. (Recall that the value in Table 2 was optimized over a coarse spacing of 0.1 in candidate pole locations.) For the purpose of illustration, Figure 7 shows how the least-squares error criterion for the single complex (Kautz) pole depends on the real and imaginary parts of $\beta$. Notice that the solid line there is the error surface for the Laguerre parameter (when we have only a real pole, $\mathrm{Im}(\beta) = 0$). Note also that the least-squares error criterion has multiple minima, as shown in this figure. Therefore, we may have local convergence depending on the initial value of the fixed pole location. One may compare the different models, as well as the true system, by looking at the kernels of the equivalent Volterra representation of each system. The kernels may be calculated as in (6) using knowledge of the estimated weight parameters and the basis functions. Figure 8 shows the first order Volterra kernel and its estimates for different pole selections. After 40 time lags the kernel coefficients of the true system description become negligible. The FPET with two fixed complex pole pairs does a good job of estimating these first 40 kernel
[Figure 8: Nonlinear satellite channel example, first order Volterra kernel (solid line) and its estimates. Top trace: (..) Volterra (M = 4), (.) Volterra (M = 20), (-.) Laguerre α = 0.5 (M = 4), (- -) Laguerre α = 0.614 (M = 4). Bottom trace: (-.) Kautz β = 0.7 ± 0.3j (M = 4), (- -) Kautz β = 0.65 ± 0.29j (M = 4), (..) fixed complex poles β1 = 0.7 ± 0.4j, β2 = 0.7 ± 0.1j (M = 4).]
coefficients. Figure 9 shows the second order Volterra kernel (top trace) and its estimate using the FPET with the two fixed complex pole pairs, as well as the difference between these two in the bottom trace. Note the good fit achieved in the second order kernels. Figure 10 shows convergence curves for the real pole $\hat{\alpha}(k)$ and the NMSE, averaged over several trials, given two different initial conditions. The adaptive algorithm successfully locates the local minimum (when $\hat{\alpha}(0) = 0$) and the global minimum (when $\hat{\alpha}(0) = 0.8$) for the Laguerre parameter, as shown in Table 3 (recall that the value of 0.5 in Table 2 is the best on a grid with spacing of 0.1). Notice that Figure 7 shows these local minima clearly. Also, Figure 11 shows convergence curves for the real and imaginary parts of the complex pole $\hat{\beta}(k)$ and the NMSE, averaged over several trials, given an initial condition.

[Figure 9: Nonlinear satellite channel second order Volterra kernel (top trace), its estimate using fixed complex poles β1 = 0.7 ± 0.4j, β2 = 0.7 ± 0.1j (middle trace), and the parameter error (bottom trace). Each panel is plotted over the two discrete time lags from 0 to 40.]

[Figure 10: Convergence of real pole location. Panels show α(k) and NMSE(k) (in dB) versus iteration k, from 0 to 15000, for α(0) = 0.8 and α(0) = 0.]

[Figure 11: Convergence of complex pole location. Panels show Im{β(k)} versus Re{β(k)}, Re{β(k)} and Im{β(k)} versus iteration k, and NMSE(k) (in dB) versus iteration k, for β(0) = 0.52 + 0.01j.]
7. CONCLUSION

This paper addressed the identification of nonlinear systems and filters with fading memory using the Volterra model structure. We proposed the FPET approach to reduce the parametric complexity associated with the Volterra model structure. The results demonstrate the usefulness of this approach for the overparametrization problem. Using multiple poles in the choice of the basis functions enhances the ability of the nonlinear model to describe the dynamics of nonlinear filters. Within the FPET approach, we also developed an adaptive algorithm based on gradient descent to identify optimal pole locations. The efficacy of the FPET approach has been demonstrated on the identification of a simplified model of a digital satellite channel. Simulations demonstrate the effectiveness of the algorithm when used to adjust the pole parameter(s) in the FPET based on Laguerre and Kautz expansions. We showed that the mean-squared error surface with respect to the pole parameters is not quadratic and may have local minima, so the adaptive algorithm can converge to a local minimum. As the error-surface examples show, a good initialization of the pole parameters is important.

ACKNOWLEDGEMENT

This work was partially supported by a Department of Veterans Affairs merit award, by the National Institutes of Health under grant DK-40426, and by Zonguldak Karaelmas University, Zonguldak, Turkey.

REFERENCES

[1] M. Schetzen, The Volterra and Wiener Theories of Nonlinear Systems, John Wiley & Sons, New York, 1980.
[2] S. Boyd and L. O. Chua, “Fading memory and the problem of approximating nonlinear operators with Volterra series,” IEEE Trans. Circuits and Systems, vol. CAS-32, no. 11, pp. 1150–1171, 1985.
[3] R. Hacıoğlu and G. A. Williamson, “Volterra based identification of nonlinear systems using fixed pole approach,” in IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing, 2001.
[4] G. A. Williamson and S. Zimmerman, “Global convergent adaptive IIR filters based on fixed pole locations,” IEEE Trans. Signal Processing, vol. 44, no. 6, pp. 1418–1427, 1996.
[5] N. Wiener, Nonlinear Problems in Random Theory, Technology Press Research Monographs, New York, 1958.
[6] V. Z. Marmarelis, “Identification of nonlinear biological systems using Laguerre expansions of kernels,” Annals of Biomed. Eng., vol. 21, pp. 573–589, 1993.
[7] V. J. Mathews, “Adaptive polynomial filters,” IEEE Signal Processing Magazine, pp. 10–26, 1991.
[8] M. J. Korenberg and L. D. Paarmann, “Orthogonal approaches to time-series analysis and system identification,” IEEE Signal Processing Magazine, vol. 8, no. 3, pp. 29–43, 1991.
[9] T. M. Panicker, V. J. Mathews, and G. L. Sicuranza, “Adaptive parallel-cascade truncated Volterra filters,” IEEE Trans. Signal Processing, vol. 46, no. 10, pp. 2664–2673, 1998.
[10] B. Wahlberg, “System identification using Laguerre models,” IEEE Trans. on Automatic Control, vol. 36, no. 5, pp. 551–562, 1991.
[11] B. Wahlberg, “System identification using Kautz models,” IEEE Trans. on Automatic Control, vol. 39, no. 6, pp. 1276–1281, 1994.
[12] S. Haykin, Adaptive Filter Theory, Prentice-Hall, New Jersey, 3rd edition, 1996.
[13] G. A. Williamson, C. R. Johnson Jr., and B. D. O. Anderson, “Locally robust identification of linear systems containing unknown gain elements with application to adapted IIR lattice models,” Automatica, vol. 27, pp. 783–798, 1991.
[14] M. Nayeri, H. Fan, and W. K. Jenkins Jr., “Some characteristics of error surfaces for insufficient order adaptive IIR filters,” IEEE Trans. Acoustics, Speech, and Signal Processing, vol. 38, no. 7, pp. 1222–1227, 1990.
[15] J. Lee and V. J. Mathews, “A fast recursive least squares adaptive second-order Volterra filter and its performance analysis,” IEEE Trans. Signal Processing, vol. 41, no. 3, pp. 1087–1102, 1993.
[16] S. Benedetto and E. Biglieri, “Nonlinear equalization of digital satellite channels,” IEEE Journal on Selected Areas in Communications, vol. SAC-1, no. 1, pp. 57–62, 1983.
[17] A. A. M. Saleh, “Frequency-independent and frequency-dependent nonlinear models of TWT amplifiers,” IEEE Trans. Communications, vol. 29, no. 11, pp. 1715–1720, 1981.
Rıfat Hacıoğlu received the B.S. degree in Electrical and Electronics Engineering from Dokuz Eylul University, Izmir, Turkey in 1993, and the M.Sc. degree in Electrical Engineering from Illinois Institute of Technology, Chicago, IL in 1996. He is a Ph.D. candidate in the area of signal processing and control systems. His main research interests are related to the identification problems of nonlinear systems based on the Volterra model structure. He is a student member of IEEE.

Geoffrey A. Williamson was born in Kansas City, MO (USA) in 1961. He received the B.S. (with distinction), M.S., and Ph.D. degrees in electrical engineering from Cornell University (Ithaca, NY, USA). Following receipt of the Ph.D. in 1989, he joined the Department of Electrical and Computer Engineering at the Illinois Institute of Technology (Chicago, IL, USA), where he is currently an Associate Professor. From August 1997 to July 1999 he was Associate Dean of the Graduate College at IIT. His research interests concern the application of system identification and adaptive filtering techniques in communication and biological systems. Dr. Williamson was the recipient of a GenRad Foundation Fellowship (1983–1987) and was twice the recipient of an IIT Graduate College Fellowship (1998–1999 and 1999–2000). He served as an Associate Editor for the IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing from 1993 to 1995. He is a member of Tau Beta Pi and Eta Kappa Nu.