International Journal of Modelling and Simulation, Vol. 22, No. 3, 2002
CUMULANTS AND GENETIC ALGORITHM FOR PARAMETERS ESTIMATION OF NONCAUSAL AUTOREGRESSIVE MODELS

S.A. Alshebeili,∗ M.A. Alsehaili,∗ and M.A. Alkanhal∗
Abstract

The authors introduce a new method for estimating the coefficients of a noncausal autoregressive (AR) model. The method is based on a new formulation that relates the unknown AR parameters to both second- and third-order cumulants. The new formulation facilitates the use of linear and nonlinear least-squares (LS) estimation techniques, and it includes some published works as special cases. The nonlinear LS estimation techniques presented in this work make use of a genetic algorithm (GA) to minimize a cost function defined in terms of the model's output cumulants. We also introduce a new method for estimating the coefficients of a noncausal AR model using the power spectrum and a one-dimensional (1-D) slice of the bispectrum. To illustrate the effectiveness of the proposed AR modelling approaches, extensive simulation examples are presented.

Key Words

Autoregressive models, higher-order statistics, cumulants, genetic algorithms, parameter estimation

1. Introduction

Autoregressive (AR) modelling of non-Gaussian processes is a well-defined problem in various science and engineering areas such as spectral estimation, speech processing, seismology, sonar, radar, radio astronomy, biomedicine, image processing, vibration analysis, and oceanography [1]. In such applications, the observed signal is often modelled as having been generated by passing white noise through an all-pole filter (i.e., an AR model). The problem is then to identify the filter parameters from the statistics of a sample function of the observations over a finite number of consecutive instants. Most of the standard algorithms developed for parameter estimation and system identification apply to causal system representations of stochastic signals. However, there are several cases of practical interest where the impulse response of the underlying system is noncausal. For example, noncausal impulse responses occur often in spatial signal processing problems such as blurring distortion of images [2], astronomical data processing [3], and geophysical signal processing [4]. The noncausal impulse response in these applications represents the distortion introduced by the sensor or the channel. It has also been found convenient in certain applications to model causal nonminimum phase systems via noncausal systems. For example, noncausal AR models have been used in [5] to model nonminimum phase seismic signals, leading to a finite-order moving average (MA) inverse. A similar approach has been taken in [6] for blind equalization of communications signals.

Different methodologies for determining the coefficients of noncausal AR models from only output cumulants have been reported in the literature [7–13]. These methods include linear and nonlinear least-squares (LS) solutions. In particular, Tugnait [11] proposed an optimization method that minimizes a squared-error cost function between the theoretical output correlations and cumulants and the corresponding sample correlations and sample cumulants. The method of [11] is not based on a direct relationship between the AR process parameters and its second- and higher-order statistics. Furthermore, the solution for the unknown parameters was obtained using a nonlinear optimization technique that may converge to a local minimum. Chen et al. [12] proposed a method to identify AR models using second- and third-order cumulants. It was shown in [12] that the AR parameters are directly related to the eigenvectors of a matrix defined in terms of second- and third-order cumulants. The method of [12] has the disadvantages of being sensitive to additive noise and of not being applicable for certain choices of cumulant lags. In [13], Chi et al. proposed a new parameter estimation method for a causal or noncausal AR system based on a new quadratic equation relating the unknown AR parameters to cumulants of the data of order higher than two. The method proposed in [13] finds the optimal estimate through an iterative numerical optimization algorithm. However, it is worth noting that when the AR parameters are estimated through cumulants of order greater than two, the resulting estimates have higher variance than those obtained through second-order correlations. On the other hand, when the AR parameters are estimated through second-order correlations, the resulting estimates have higher bias than those obtained through higher-order cumulants.

In this work we address the problem of AR modelling of non-Gaussian processes by making use of both second- and higher-order cumulants in an attempt to improve the accuracy of the estimated parameters. Specifically, the contributions and organization of this article are as follows. Section 2 presents the definitions and properties of higher-order statistics (moments and cumulants) and cumulant spectra of stationary random processes. In Section 3 we develop parametric approaches for the identification of causal and noncausal AR models using only output cumulants. First, we develop a new equation relating the unknown AR parameters to both second- and higher-order cumulants and show that the method presented includes the work of Chi et al. [13] as a special case. Second, we use the new relationship to form a set of linear equations that can be solved for the unknown model parameters. We also address the problem of parameter estimation of an AR model using slices of higher-order spectra. In Section 4 we introduce a new approach for estimating the coefficients of an AR model by minimizing nonlinear cost functions defined in terms of the model's output cumulants. Because of its simple implementation and its effectiveness in approaching the global minimum of a multimodal error surface, the genetic algorithm (GA) is introduced in this section and used as an optimizer for the defined nonlinear cost functions. Section 5 presents simulation examples to illustrate the effectiveness of the proposed AR modelling approaches. Concluding remarks are given in Section 6.

∗ Department of Electrical Engineering, King Saud University, P.O. Box 800, Riyadh 11421, Saudi Arabia; e-mail: [email protected] (paper no. 205-2049)
2. Higher-Order Statistics

If y(n) is a real stationary random process, then its moment of order k + 1 is given by:

m_{k+1}(τ1, τ2, . . . , τk) ≜ E{y(n)y(n + τ1) · · · y(n + τk)}   (1)

where E{·} denotes the mathematical expectation operation. Similarly, the (k + 1)st-order cumulants of y(n) are k-dimensional functions. Cumulants up to order four are of practical interest and, for a zero-mean stationary random process y(n), are given by [7]:

c1 = E{y(n)} = 0   (2)

c2(τ) = E{y(n)y(n + τ)}   (3)

c3(τ1, τ2) = E{y(n)y(n + τ1)y(n + τ2)}   (4)

c4(τ1, τ2, τ3) = E{y(n)y(n + τ1)y(n + τ2)y(n + τ3)} − c2(τ1)c2(τ2 − τ3) − c2(τ2)c2(τ3 − τ1) − c2(τ3)c2(τ1 − τ2)   (5)

Note that the first-, second-, and third-order cumulants of zero-mean processes are identical to the first-, second-, and third-order moments, respectively. The fourth-order cumulant, on the other hand, is equal to the fourth-order moment of the underlying process less the fourth-order moment of a zero-mean Gaussian random process with the same autocorrelation sequence.

Polyspectra, or higher-order spectra, are usually defined in terms of higher-order cumulants. The (k + 1)st-order spectrum C_{k+1}(ω1, ω2, . . . , ωk) of the process y(n) is defined as the Fourier transform of its (k + 1)st-order cumulant c_{k+1}(τ1, τ2, . . . , τk), that is:

C_{k+1}(ω1, ω2, . . . , ωk) = Σ_{τ1=−∞}^{∞} Σ_{τ2=−∞}^{∞} · · · Σ_{τk=−∞}^{∞} c_{k+1}(τ1, τ2, . . . , τk) exp{−j(ω1τ1 + ω2τ2 + · · · + ωkτk)}   (6)

In general, C_{k+1}(ω1, ω2, . . . , ωk) is complex, and a sufficient condition for its existence is that c_{k+1}(τ1, τ2, . . . , τk) be absolutely summable.

3. Linear Identification Approaches

In this section, we introduce new linear approaches to the identification of noncausal AR models given by:

y(n) = Σ_{i=−∞}^{∞} h(n − i)x(i)   (7)

where y(n) is the observed signal, h(n) is the impulse response of the system generating y(n), and x(n) is the driving sequence, which is assumed to be a non-Gaussian, independent and identically distributed (i.i.d.) random process with E{x(n)} = 0, E{x(n)x(n + τ)} = σx²δ(τ), and E{x(n)x(n + τ1)x(n + τ2)} = βxδ(τ1, τ2). The terms δ(τ) and δ(τ1, τ2) denote the 1-D and 2-D Kronecker delta functions, respectively. Let H(z) be the z-transform of the system impulse response h(n). That is:

H(z) = Σ_{i=−∞}^{∞} h(i)z^{−i}   (8)

For AR models, H(z) takes the form:

H(z) = 1/B̂(z) = 1 / Σ_{i=−q⁻}^{q⁺} b̂(i)z^{−i} = 1 / [Π_{i=1}^{q⁻}(1 − αi z) Π_{i=1}^{q⁺}(1 − γi z^{−1})]   (9)

for some positive integers q⁻ and q⁺. In the expression of H(z), we assume |αi| < 1 and |γi| < 1, so that Π_{i=1}^{q⁻}(1 − αi z) is the maximum-phase factor with an anticausal impulse response, and Π_{i=1}^{q⁺}(1 − γi z^{−1}) is the minimum-phase factor with a causal impulse response. In the time domain, (9) is equivalent to:

Σ_{i=−q⁻}^{q⁺} b̂(i)y(n − i) = x(n)   (10)

Our objective in this article is to develop methods for recovering the coefficients {b̂(i)} from observations of y(n) when x(n) is not accessible.
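As a concrete illustration of the cascade structure in (9) and (10), the sketch below generates a noncausal AR output by causal filtering with the minimum-phase factor followed by anticausal (time-reversed) filtering with the maximum-phase factor, the same procedure used in the simulations of Section 5. The factor values γ = 0.4 and α = 0.5 and the zero-mean exponential driving noise are illustrative assumptions, not values from the paper's examples:

```python
import random

def simulate_noncausal_ar(n_samples, gamma=0.4, alpha=0.5, seed=0, burn=500):
    """Generate y(n) for H(z) = 1 / [(1 - alpha*z)(1 - gamma*z^-1)].

    x(n) is zero-mean, unit-variance exponential noise (skewness beta_x = 2).
    gamma (causal, minimum-phase factor) and alpha (anticausal, maximum-phase
    factor) are illustrative, hypothetical values.
    """
    rng = random.Random(seed)
    n_total = n_samples + 2 * burn
    x = [rng.expovariate(1.0) - 1.0 for _ in range(n_total)]
    # causal stable part, 1/(1 - gamma z^-1): w(n) = gamma*w(n-1) + x(n)
    w, prev = [], 0.0
    for xn in x:
        prev = gamma * prev + xn
        w.append(prev)
    # anticausal stable part, 1/(1 - alpha z): y(n) = alpha*y(n+1) + w(n),
    # implemented as a backward recursion in time
    y, nxt = [0.0] * n_total, 0.0
    for n in range(n_total - 1, -1, -1):
        nxt = alpha * nxt + w[n]
        y[n] = nxt
    return y[burn:burn + n_samples]  # discard edge transients on both sides

y = simulate_noncausal_ar(2048)
```

Because the anticausal recursion runs backward, samples near both edges of the record depend on data outside it; the `burn` margin simply discards those transients.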
3.1 A Basic Relationship

In this section, we derive a new relationship relating the unknown AR parameters {b̂(i)} to both the second- and third-order cumulants of the output measurements. This relationship is important because it serves as the basis for our identification algorithms. Let us assume that we are given the second- and third-order cumulants of the output sequence (in practice they can be estimated). For the linear system h(n) driven by x(n), the second- and third-order cumulants of the output sequence y(n) are given by [8]:

c2(τ) = σx² Σ_{n=−∞}^{∞} h(n)h(n + τ)   (11)

c3(τ1, τ2) = βx Σ_{n=−∞}^{∞} h(n)h(n + τ1)h(n + τ2)   (12)

where σx² and βx are the variance and skewness of the input sequence x(n), respectively. Taking the Fourier transform of both sides of (11) and (12), the following expressions result:

C2(ω) = σx² H(ω)H(−ω)   (13)

C3(ω1, ω2) = βx H(ω1)H(ω2)H(−ω1 − ω2)   (14)

where H(ω) ≜ H(z)|_{z=e^{jω}}, and C2(ω) and C3(ω1, ω2) are the power spectrum and bispectrum of the observed sequence y(n). By substituting ω = ω1 + ω2 in (13), we obtain:

C2(ω1 + ω2) = σx² H(ω1 + ω2)H(−ω1 − ω2)   (15)

From (14), (15), and the fact that H(ω) = 1/B̂(ω), we can write:

C3(ω1, ω2)B̂(ω1)B̂(ω2) = ε C2(ω1 + ω2)B̂(ω1 + ω2)   (16)

where ε = βx/σx². Taking the inverse Fourier transform of both sides of (16), we obtain the relationship:

Σ_{i=−q⁻}^{q⁺} Σ_{j=−q⁻}^{q⁺} b̂(i)b̂(j)c3(τ1 − i, τ2 − j) = ε Σ_{i=−q⁻}^{q⁺} b̂(i)c2(τ1 − i)δ(τ1 − τ2)   (17)

The above equation has novel features in that it relates the AR parameters to the second- and third-order cumulants of the observed sequence, and it leads to new approaches to the identification of AR models using slices of higher-order spectra. Furthermore, the basic equation used in [13] can be considered as a special case of (17). Specifically, it has been shown in [13] that:

Σ_{i=−q⁻}^{q⁺} Σ_{j=−q⁻}^{q⁺} b̂(i)b̂(j)c3(τ1 − i, −j) = ε h(0)δ(τ1)   (18)

Therefore, by setting τ2 = 0 in (17) and using the inverse relationship Σ_i b̂(i)h(n − i) = δ(n), we obtain:

Σ_{i=−q⁻}^{q⁺} Σ_{j=−q⁻}^{q⁺} b̂(i)b̂(j)c3(τ1 − i, −j)
  = ε Σ_{i=−q⁻}^{q⁺} b̂(i)c2(τ1 − i)δ(τ1)
  = ε Σ_{i=−q⁻}^{q⁺} b̂(i) [Σ_{k=−∞}^{∞} h(k)h(τ1 − i + k)] δ(τ1)
  = ε Σ_{k=−∞}^{∞} h(k) [Σ_{i=−q⁻}^{q⁺} b̂(i)h(τ1 − i + k)] δ(τ1)
  = ε Σ_{k=−∞}^{∞} h(k)δ(τ1 + k)δ(τ1)
  = ε h(0)δ(τ1)   (19)

which is exactly the same result as that of [13].

3.2 Least-Square Method – I

In this section we develop the LS method for identifying the model parameters {b̂(i)} from the second- and third-order cumulants of the observed sequence. With output data only, it is possible to identify H(z) only to within time-shift and scale ambiguities. Thus, we may assume without loss of generality that:

B(z) = Σ_{i=0}^{q} b(i)z^{−i}   (20)

where:

b(i) = b̂(i − q⁻),   b(0) = 1,   q = q⁻ + q⁺

Based on (17), we can form the following system of linear equations:

d = Mr   (21)

where:

r = (ε, εb(1), . . . , εb(q), b(1), b(2), . . . , b(q), b²(1), b(1)b(2), . . . , b²(q))ᵀ, a (q² + 5q + 2)/2 column vector;

d = (c3(−p, −p), c3(−p, −p + 1), . . . , c3(−p, p), c3(−p + 1, −p), . . . , c3(p, p))ᵀ, a (4p² + 4p + 1) column vector;

and M is a matrix of size (4p² + 4p + 1) × (q² + 5q + 2)/2 whose entries are determined according to (17). The parameter p denotes the maximum lag (i.e., the range of τ1 and τ2) of the cumulant sequence. The LS solution of this overdetermined system of equations is given by:

r = (MᵀM)⁻¹Mᵀd   (22)

The unknown coefficients b(1), b(2), . . . , b(q) can then be directly determined from the elements of the vector r. This would be the end of the matter if there were no measurement noise or estimation errors. If this is not the case, we propose an alternative approach, which exploits all the available information provided by the vector r. First, we form the matrix R from the vector r as follows:

      [ ε       εb(1)     εb(2)     · · ·   εb(q)    ]
      [ 1       b(1)      b(2)      · · ·   b(q)     ]
R =   [ b(1)    b²(1)     b(1)b(2)  · · ·   b(1)b(q) ]   (23)
      [ b(2)    b(2)b(1)  b²(2)     · · ·   b(2)b(q) ]
      [ ...     ...       ...       · · ·   ...      ]
      [ b(q)    b(q)b(1)  b(q)b(2)  · · ·   b²(q)    ]

It is clear from the structure of R that its rank is one. R can always be written in the following form:

R = bε bᵀ,   bε = (ε, 1, b(1), . . . , b(q))ᵀ,   b = (1, b(1), b(2), . . . , b(q))ᵀ   (24)

The unknown AR parameters {b(i)} can now be identified from the matrix R using one of the elegant techniques of numerical linear algebra, the singular value decomposition (SVD). That is:

R = ZVUᵀ   (25)

where V is a diagonal matrix whose diagonal elements are the singular values of R. The columns of the unitary matrix Z, that is, z1, z2, . . . , z_{2+q}, are the left singular vectors of R, and the columns of the second unitary matrix U, that is, u1, u2, . . . , u_{1+q}, are the right singular vectors of R [14]. As R is of rank one, it follows that there is only one nonzero singular value, and its corresponding singular vectors determine the unknown coefficient vector. From the properties of the SVD, it can be shown that:

b(n) = z1(0)V(0, 0)u1(n),   0 ≤ n ≤ q   (26)

bε(n) = u1(0)V(0, 0)z1(n),   0 ≤ n ≤ q + 1   (27)

It is relevant at this point to mention that, theoretically, only one singular value of R is nonzero. In practice, due to noise and estimation errors, there may be many nonzero singular values, but only a single dominant one. In fact, as the number of realizations approaches infinity, this dominance becomes more pronounced. In such a case, we keep the dominant singular value and its corresponding singular vectors and use them in the determination of b(n) and bε(n) through (26) and (27), respectively.

3.3 Least-Square Method – II

In this section, we present another possible approach to identifying the AR parameters {b(i)} from the cumulants of the output sequence y(n). In this approach, we use the LS method to identify {b(i)} from the power spectrum and a 1-D slice of the bispectrum. The method utilizes the 1-D slice corresponding to ω1 = ω2, which has been previously used in [15–18] for blind identification of nonminimum phase systems. By substituting ω1 = ω2 = ω in (16), the following relationship results:

C3(ω, ω)B²(ω) = ε C2(2ω)B(2ω)   (28)

In the time domain, (28) takes the form:

Σ_{i=0}^{2q} b1(i)c̃(τ − i) = ε Σ_{i=0}^{2q} b2(i)s̃(τ − i)
                           = ε Σ_{i=0}^{q} b(i)c2(τ/2 − i)   if τ/2 is an integer,
                             0                               otherwise   (29)

where c̃(i), b1(i), s̃(i), and b2(i) are the inverse Fourier transforms of C3(ω, ω), B²(ω), C2(2ω), and B(2ω), respectively. The expression (29) relates the AR parameters to the cumulants of the output sequence y(n). Generally, (29) can be written in matrix form as follows:

d = Dr   (30)

where:

d = (c̃(−2p), c̃(−2p + 1), . . . , c̃(2p), 0, . . . , 0)ᵀ, a (4p + 2q + 1) column vector;

r = (ε, εb(1), . . . , εb(q), b1(1), b1(2), b1(3), . . . , b1(2q))ᵀ, a (3q + 1) column vector;

and D is a (4p + 2q + 1) × (3q + 1) matrix whose entries are determined according to (29) with b1(0) = 1: in the row associated with lag τ, −2p ≤ τ ≤ 2p + 2q, the first q + 1 columns, which multiply (ε, εb(1), . . . , εb(q)), hold the samples c2(τ/2 − i), i = 0, 1, . . . , q, when τ/2 is an integer and zeros otherwise, while the remaining 2q columns, which multiply (b1(1), . . . , b1(2q)), hold the samples −c̃(τ − i), i = 1, 2, . . . , 2q; cumulant samples beyond the computed lag range are set to zero.   (31)
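Before solving this system, it is worth making the rank-one refinement of Section 3.2 ((23) to (27)) concrete. A library SVD routine would normally be used; the pure-Python power iteration below is a stand-in that is adequate because R is (nearly) rank one, and the q = 2 coefficients and ε are hypothetical. The scale and sign ambiguities of the SVD are resolved here by the normalization b(0) = 1 rather than by the exact forms of (26) and (27):

```python
import math

def dominant_singular_triple(R, iters=100):
    """Dominant (sigma, z1, u1) of R, found by power iteration on R^T R.

    A minimal stand-in for a full SVD; it converges almost immediately here
    because R has a single dominant singular value.
    """
    rows, cols = len(R), len(R[0])
    u = [1.0] * cols
    for _ in range(iters):
        w = [sum(R[i][j] * u[j] for j in range(cols)) for i in range(rows)]  # R u
        u = [sum(R[i][j] * w[i] for i in range(rows)) for j in range(cols)]  # R^T R u
        norm = math.sqrt(sum(t * t for t in u))
        u = [t / norm for t in u]
    z = [sum(R[i][j] * u[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(t * t for t in z))
    return sigma, [t / sigma for t in z], u

# hypothetical q = 2 example: b = (1, b(1), b(2)) and eps = beta_x / sigma_x^2
b, eps = [1.0, -0.6, 0.25], 2.0
b_eps = [eps] + b                              # (eps, 1, b(1), b(2)), as in (24)
R = [[be * bj for bj in b] for be in b_eps]    # the rank-one matrix of (23)
sigma, z1, u1 = dominant_singular_triple(R)
b_hat = [t / u1[0] for t in u1]    # b(0) = 1 fixes the SVD's scale and sign
eps_hat = z1[0] / z1[1]            # first two entries of b_eps are (eps, 1)
```

With noisy cumulant estimates, R built from the LS estimate r̂ of (22) is only approximately rank one, and the dominant singular pair acts as the rank-one projection described in the text.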
The LS solution of this overdetermined system of equations is given by:

r = (DᵀD)⁻¹Dᵀd   (32)

Once we obtain r, the sequence b(i) can be recovered from the first q + 1 elements of r; that is, b(i) = r(i)/r(0) for 1 ≤ i ≤ q. An important property of the LS method of this section is that it can be extended in a straightforward manner to make use of the (k + 1)st-order spectrum. If C_{k+1}(ω, . . . , ω) is the 1-D slice of the (k + 1)st-order spectrum, the ratio G(ω) is related in this case to the system frequency response H(ω) by:

G(ω) = ε Hᵏ(ω)/H(kω)   (33)

where G(ω) = C_{k+1}(ω, . . . , ω)/C2(kω), and C2(kω) is the Fourier transform of the sequence:

s̃(τ) = c2(τ/k)   if τ/k is an integer,
       0          otherwise   (34)

In the time domain, (33) takes the form:

Σ_{i=0}^{kq} b1(i)c̃(τ − i) = ε Σ_{i=0}^{kq} b2(i)s̃(τ − i)   (35)

where now:

b1(n) = F⁻¹[Bᵏ(ω)],   b2(n) = F⁻¹[B(kω)],   c̃(τ) = F⁻¹[C_{k+1}(ω, . . . , ω)]

Equation (35) resembles (29). Therefore, it can be solved in a way similar to what we have done before using LS methods.

4. Nonlinear Identification Approaches

4.1 Preliminaries

The methods presented in the previous section require the solution of a system of linear equations. Identification methods that exploit all the relevant statistics and use nonlinear optimization techniques are also of interest. Practically speaking, estimates obtained via nonlinear solutions are generally better than those obtained via linear solutions, provided that the former are properly initialized. In this section, we introduce a new approach to estimating the coefficients of a noncausal AR model by minimizing a nonlinear cost function defined in terms of the model's output cumulants. Let us assume that we are given the second- and third-order cumulants of the output sequence. Based on (17), our basic method is to choose the coefficients {b(i)}_{i=1}^{q} to minimize the following cost function:

J = Σ_{τ1} Σ_{τ2} [ Σ_{i=0}^{q} Σ_{j=0}^{q} b(i)b(j)c3(τ1 − i, τ2 − j) − ε Σ_{i=0}^{q} b(i)c2(τ1 − i)δ(τ1 − τ2) ]²   (36)

Also in this section, we use the same nonlinear method to estimate the AR coefficients using the power spectrum and a 1-D slice of the bispectrum. Based on (29), we can define the following cost function:

Jd = Σ_{τ} [ Σ_{i=0}^{2q} b1(i)c̃(τ − i) − ε Σ_{i=0}^{2q} b2(i)s̃(τ − i) ]²   (37)

Traditionally, two major classes of optimization algorithms are used: calculus-based techniques and enumerative techniques. Calculus-based optimization techniques employ a gradient-directed searching mechanism over the error surface, or differentiable surface, of an objective function [19]. However, for ill-defined or multimodal objective functions, local optima are frequently obtained. In signal processing, objective functions in this category are common, as the signal can be noisy, fuzzy, and discontinuous [20]. In 1975 Holland introduced another optimization technique that is very different from traditional techniques [21]. This technique is a stochastic global optimization method that mimics natural biological evolution and is known as the GA. It is similar to its associated algorithms (simulated annealing [22], evolutionary strategies [23], and evolutionary programming [24]), which are classified as guided random techniques. In what follows, we use the GA as an optimizer for the previously defined cost functions. There are three main motivations behind the use of the GA in this signal processing setting. First, the GA has produced excellent results when applied to a closely related parameter estimation problem [25]. Second, the GA is simple to modify, enhance, and hybridize with other mathematical optimization schemes. Third, the GA has a parallel nature that makes it amenable to implementation on modern parallel architectures [26–29].
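The GA operators described in Section 4.2 can be made concrete with a minimal sketch. The 10-bit encoding matches the paper's choice, while the population size, generation count, mutation rate, seed, and the 1-D Rastrigin test cost (a multimodal stand-in for (36) and (37)) are illustrative assumptions:

```python
import math
import random

def ga_minimize(cost, lo, hi, n_bits=10, pop_size=40, n_gen=80, p_mut=0.02, seed=1):
    """Minimal binary-coded GA with rank selection and elitism.

    One parameter is encoded with n_bits (10 bits, as in the paper); the
    remaining settings are illustrative, not the paper's.
    """
    rng = random.Random(seed)
    def decode(bits):                      # map a bit string onto [lo, hi]
        k = int("".join(map(str, bits)), 2)
        return lo + (hi - lo) * k / (2 ** n_bits - 1)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(n_gen):
        pop.sort(key=lambda c: cost(decode(c)))     # rank the chromosomes
        parents = pop[:pop_size // 2]               # the fittest half mate
        children = [parents[0][:]]                  # elitism: best survives unchanged
        while len(children) < pop_size:
            pa, pb = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)          # single-point crossover
            child = pa[:cut] + pb[cut:]
            for i in range(n_bits):                 # mutation with rate p_mut
                if rng.random() < p_mut:
                    child[i] ^= 1
            children.append(child)
        pop = children
    return decode(min(pop, key=lambda c: cost(decode(c))))

# toy multimodal cost standing in for (36)/(37): 1-D Rastrigin function
rastrigin = lambda x: x * x + 10.0 * (1.0 - math.cos(2.0 * math.pi * x))
x_best = ga_minimize(rastrigin, -5.12, 5.12)
```

In the actual GEN-I and GEN-II algorithms, each of the q unknown coefficients would receive its own 10-bit gene and a decoded candidate would be scored by (36) or (37) rather than by this toy cost.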
4.2 Genetic Algorithm

Basically, the GA starts with a population, which consists of a collection of chromosomes representing candidate solutions to the optimization problem. Each individual solution is evaluated for its fitness in solving the given minimization/maximization task. At each generation, the most fit chromosomes are allowed to mate and bear offspring. These children (new chromosomes, i.e., new parameter sets) then form the basis for the next generation. The main operators of the proposed GA can be summarized as follows:

1. Randomly generate an initial population of Nc chromosomes, each with d genes {g1, g2, . . . , gd} as binary strings.
2. Decode the chromosomes and evaluate the fitness of each chromosome in the current population.
3. Select pairs of parent chromosomes and pass the selected parents into the next population.
4. Cross over the selected parents at a randomly chosen point to form new chromosomes (offspring).
5. Replace the current population with the new population.
6. Mutate the new population with probability pm (mutation rate).
7. Calculate the fitness of each chromosome in the new population.
8. If the optimization criteria are met, stop and return the best chromosome; otherwise repeat steps 3 to 7.

Fig. 1 illustrates the hierarchy of the algorithm operators.

Figure 1. The hierarchy of the genetic algorithm operators.

4.3 Optimization Algorithm

The modelling algorithm that we propose makes use of the GA to minimize the cost functions defined in (36) and (37). The unknown model parameters are encoded using 10 bits. In our algorithms, the number of chromosomes Nc is equal to 10q, where q is the number of unknown coefficients, and parents are selected by rank; that is, the chromosomes with the best estimation fitness are automatically chosen to mate. Passing the best chromosomes unchanged into the next generation guarantees that the minimum estimation error decreases monotonically as the generations evolve. Further, the selected parents are mated at the gene level according to a probability weight. It has been found through computer simulations that this scheme is capable of reaching the global solution faster than methods based on mating at the chromosome level.

5. Computer Simulations

In this simulation, the observed sequence z(n) is generated by the signal model:

y(n) = − Σ_{i=1}^{q} b(i)y(n − i) + x(n)

z(n) = y(n) + v(n)   (38)

where the driving input signal x(n) is a zero-mean exponentially distributed i.i.d. random sequence with variance σx² = 1 and skewness βx = 2. For the selected noncausal AR model, the sequence y(n) is generated by passing x(n) through the causal stable part followed by the anticausal stable part of the AR model. The signal v(n) is a zero-mean white Gaussian process added to the model output y(n) to generate the noisy data z(n) at different levels of SNR, where SNR = E{y²(n)}/E{v²(n)}. Five different lengths of output data z(n) have been used in this study: N = 256, 512, 1,024, 2,048, and 4,096. The model coefficients have been computed for 100 realizations of the observed data, corresponding to different seed values, at SNR = ∞ (the noise-free case). Aside from the noise-free case, the unknown coefficients have also been computed for N = 2,048 at SNR = 100, 20, and 10 dB. The second- and third-order cumulants of z(n) have been computed by first dividing the sequence z(n) into K records of M samples each.
For each record, we compute the second- and third-order cumulants using:

µ = (1/M) Σ_n z(n)

c2(τ) = (1/M) Σ_n (z(n) − µ)(z(n + τ) − µ)

c3(τ1, τ2) = (1/M) Σ_n (z(n) − µ)(z(n + τ1) − µ)(z(n + τ2) − µ)   (39)

with 0 ≤ n ≤ M − 1, and then we average over all K records. In all the simulations presented, we have chosen M = 128. Three noncausal AR models of different orders have been considered. These models are described by the following system transfer functions:

H1(z) = 1 / (z² + 0.5556z + 1.1110)   (40)

H2(z) = 1 / (z − 2.0498 + 1.6499z⁻¹ − 0.8125z⁻²)   (41)

H3(z) = 1 / (z³ − 1.7668z² + 2.5336z − 3.0502 + 1.6668z⁻¹ − 0.8334z⁻²)   (42)

The pole-zero plots of the model transfer functions {Hi(z)}_{i=1}^{3} and their corresponding impulse responses are shown in Fig. 2.

Figure 2. Pole-zero plots of transfer functions of AR models.

Considering the identified coefficients as random variables, the mean and standard deviation of these coefficients have been computed over the 100 realizations and tabulated in Tables 1–3. True and estimated parameters (mean ± standard deviation) obtained using the LS method of Section 3.2 (denoted by LMS-I) at SNR = ∞ and different lengths of output data are summarized in Table 1. Observe that a larger data length improves the performance of the algorithm. In fact, increasing the data length N reduces the discrepancy between the true statistics and the estimated statistics utilized in the LMS-I algorithm, which in turn leads to higher accuracy of the resulting model parameters. On the other hand, the performance degrades slightly as the model order q increases, and it has been found to be sensitive to model-order mismatch. Methods for determining the true model order of an AR process are available in the literature (see, e.g., [9], [30], and the references therein). Table 2 shows the results obtained by the LMS-I algorithm with data length N = 2,048 and different values of SNR. We note from the table that the standard deviation of the unknown model parameters increases as the SNR decreases. This is intuitively not surprising, because the observed signal z(n) is a noise-corrupted version of the original model output y(n).
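The record-based estimators of (39) can be sketched as follows. The i.i.d. zero-mean exponential test sequence (for which c2(0) = 1 and c3(0, 0) = 2 while all nonzero lags vanish) and the lag range are illustrative choices, and only the τ2 = 0 slice of c3 is computed to keep the sketch short:

```python
import random

def sample_cumulants(z, max_lag, m=128):
    """Record-averaged cumulant estimates following (39).

    z is split into K records of m samples; biased per-record estimates of
    c2(tau) and of the slice c3(tau, 0) are averaged over all records.
    """
    records = [z[k:k + m] for k in range(0, len(z) - m + 1, m)]
    c2 = [0.0] * (max_lag + 1)
    c3 = [0.0] * (max_lag + 1)
    for r in records:
        mu = sum(r) / m                     # per-record sample mean
        d = [v - mu for v in r]
        for tau in range(max_lag + 1):
            c2[tau] += sum(d[n] * d[n + tau] for n in range(m - tau)) / m
            c3[tau] += sum(d[n] * d[n] * d[n + tau] for n in range(m - tau)) / m
    k = len(records)
    return [v / k for v in c2], [v / k for v in c3]

rng = random.Random(7)
z = [rng.expovariate(1.0) - 1.0 for _ in range(128 * 512)]   # K = 512 records
c2_hat, c3_hat = sample_cumulants(z, max_lag=3)
```

For an AR output the same estimators are applied to z(n) from (38); only the record count K and the maximum lag p change.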
Table 1
Estimated AR Coefficients of Simulation Example Using LMS-I at SNR = ∞ (entries are mean µ ± standard deviation σ over 100 realizations)

Model Order = 2
True Value    N = 256            N = 512            N = 1,024          N = 2,048          N = 4,096
 0.5556       0.5308 ± 0.0439    0.5372 ± 0.0345    0.5447 ± 0.0296    0.5525 ± 0.0178    0.5548 ± 0.0136
 1.1110       0.9903 ± 0.0825    1.0169 ± 0.0609    1.0593 ± 0.0431    1.0955 ± 0.0345    1.1129 ± 0.0221

Model Order = 3
True Value    N = 256            N = 512            N = 1,024          N = 2,048          N = 4,096
−2.0498      −1.6024 ± 0.3449   −1.7363 ± 0.2945   −1.9293 ± 0.1646   −1.9800 ± 0.0946   −2.0124 ± 0.0582
 1.6499       1.1782 ± 0.3815    1.2819 ± 0.3259    1.4951 ± 0.1910    1.5466 ± 0.1259    1.5965 ± 0.0659
−0.8125      −0.4703 ± 0.2488   −0.5614 ± 0.2133   −0.6983 ± 0.1178   −0.7431 ± 0.0852   −0.7805 ± 0.0462

Model Order = 5
True Value    N = 256            N = 512            N = 1,024          N = 2,048          N = 4,096
−1.7668      −1.4480 ± 0.3913   −1.4769 ± 0.3477   −1.5657 ± 0.3412   −1.5693 ± 0.2525   −1.7129 ± 0.1774
 2.5336       1.8181 ± 0.8471    1.8756 ± 0.7154    1.9757 ± 0.7350    2.0067 ± 0.4822    2.2737 ± 0.3422
−3.0502      −1.4715 ± 1.1288   −1.5572 ± 1.0000   −1.6337 ± 1.0052   −1.6646 ± 0.5937   −1.9755 ± 0.4448
 1.6668       0.8511 ± 0.9453    0.9330 ± 0.7872    0.9808 ± 0.8566    0.9962 ± 0.4834    1.2166 ± 0.3743
−0.8334      −0.2423 ± 0.6302   −0.2710 ± 0.4544   −0.3291 ± 0.5678   −0.35451 ± 0.2924  −0.3567 ± 0.2299
Table 2
Estimated AR Coefficients of Simulation Example Using LMS-I at N = 2,048 (entries are mean µ ± standard deviation σ)

Model Order = 2
True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
 0.5556       0.5525 ± 0.0178    0.5531 ± 0.0187    0.5424 ± 0.0225    0.5243 ± 0.0458
 1.1110       1.0955 ± 0.0345    1.0902 ± 0.0395    1.0541 ± 0.0458    1.0350 ± 0.0715

Model Order = 3
True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
−2.0498      −1.9800 ± 0.0946   −1.9727 ± 0.1326   −1.8960 ± 0.1784   −1.1587 ± 0.3082
 1.6499       1.5466 ± 0.1259    1.5425 ± 0.1564    1.4480 ± 0.1655    0.5848 ± 0.2685
−0.8125      −0.7431 ± 0.0852   −0.7446 ± 0.0993   −0.6765 ± 0.1175   −0.1541 ± 0.1655

Model Order = 5
True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
−1.7668      −1.5693 ± 0.2525   −1.6366 ± 0.2364   −1.5408 ± 0.2660   −1.0789 ± 0.4215
 2.5336       2.0067 ± 0.4822    2.1231 ± 0.4530    1.9060 ± 0.5776    1.0096 ± 0.6285
−3.0502      −1.6646 ± 0.5937   −1.8077 ± 0.5843   −1.5019 ± 0.8144   −0.3894 ± 1.3919
 1.6668       0.9962 ± 0.4834    1.0935 ± 0.4869    0.8436 ± 0.6864   −0.0601 ± 1.4112
−0.8334      −0.2445 ± 0.2924   −0.3097 ± 0.2861   −0.1455 ± 0.4309    0.3983 ± 1.6996
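The contrast between the second- and third-order statistics at low SNR can be illustrated numerically. The sketch below adds unit-variance Gaussian noise (0 dB SNR) to a skewed i.i.d. sequence: the zero-lag second-order cumulant grows by the noise variance, while the zero-lag third-order cumulant is essentially unchanged. The exponential signal model is an illustrative assumption:

```python
import random

def c2_zero(x):   # zero-lag second-order cumulant (the variance)
    mu = sum(x) / len(x)
    return sum((t - mu) ** 2 for t in x) / len(x)

def c3_zero(x):   # zero-lag third-order cumulant (the third central moment)
    mu = sum(x) / len(x)
    return sum((t - mu) ** 3 for t in x) / len(x)

rng = random.Random(3)
N = 120_000
y = [rng.expovariate(1.0) - 1.0 for _ in range(N)]   # skewed signal: c2 = 1, c3 = 2
v = [rng.gauss(0.0, 1.0) for _ in range(N)]          # Gaussian noise at 0 dB SNR
z = [a + b for a, b in zip(y, v)]

growth = c2_zero(z) - c2_zero(y)   # second-order cumulant absorbs the noise power
drift = c3_zero(z) - c3_zero(y)    # third-order cumulant is (ideally) untouched
```

The near-zero `drift` is the empirical counterpart of the statement below that cumulants of order greater than two vanish for Gaussian processes.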
For Gaussian processes only, cumulants of order greater than two are identically zero. This property can be exploited in estimating the third-order cumulant of noisy observations. Under the assumption that v(n) is independent of y(n), the third-order cumulant of z(n) is equal to that of y(n), which indicates that the third-order cumulant is not affected by additive Gaussian noise. The second-order statistics, on the other hand, are affected by the presence of noise, because the second-order cumulant of z(n) equals the second-order cumulant of y(n) plus that of v(n). However, if the second-order cumulant of the additive noise is nonzero only at zero lag, the consistency of the LS solution will remain unaffected if the rows of the matrix M containing the samples c2(0) are removed. Table 3 shows the results obtained when the LS method of Section 3.3 (denoted by LMS-II) is implemented at different values of SNR and data length N = 2,048.
Table 3
Estimated AR Coefficients of Simulation Example Using LMS-II at N = 2,048 (entries are mean µ ± standard deviation σ)

Model Order = 2
True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
 0.5556       0.7922 ± 0.2138    0.8392 ± 0.2185    0.9038 ± 0.2477    0.9978 ± 0.6638
 1.1110       1.2652 ± 0.1930    1.3394 ± 0.2408    1.3683 ± 0.4136    1.3628 ± 0.7022

Model Order = 3
True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
−2.0498      −2.1071 ± 0.4051   −2.1629 ± 0.7703   −2.3306 ± 0.8770   −1.5774 ± 1.2767
 1.6499       1.6996 ± 0.3929    1.6244 ± 0.7472    1.7076 ± 0.8111    1.1321 ± 1.2893
−0.8125      −0.8083 ± 0.2165   −0.8146 ± 0.4030   −0.7733 ± 0.8388   −0.4791 ± 0.8590

Model Order = 5
True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
−1.7668      −1.8454 ± 0.6225   −1.9079 ± 0.9366   −2.1476 ± 1.3884   −1.7948 ± 1.1911
 2.5336       2.9240 ± 0.6690    3.0609 ± 1.2188    3.2368 ± 2.0722    2.6600 ± 2.4890
−3.0502      −1.8697 ± 0.7040   −1.9919 ± 0.9790   −2.0488 ± 1.3709   −2.1379 ± 1.6246
 1.6668       0.8745 ± 0.6598    0.9742 ± 0.7935    1.0013 ± 1.0858    1.0052 ± 1.1626
−0.8334      −0.5343 ± 0.3907   −0.5987 ± 0.5224   −0.5412 ± 0.8243   −0.4260 ± 1.0466
Table 4
Estimated AR Coefficients of Simulation Example Using GEN-I at N = 2,048 (entries are mean µ ± standard deviation σ)

True Value    SNR = ∞            SNR = 100 dB       SNR = 20 dB        SNR = 10 dB
−1.7668      −1.7453 ± 0.0545   −1.7538 ± 0.0599   −1.7228 ± 0.0596   −1.5681 ± 0.1165
 2.5336       2.4855 ± 0.0909    2.4932 ± 0.0955    2.4487 ± 0.0949    2.2695 ± 0.1790
−3.0502      −3.0807 ± 0.0983   −3.0789 ± 0.1424   −3.0782 ± 0.1623   −3.1224 ± 0.3092
 1.6668       1.6402 ± 0.0983    1.6350 ± 0.1039    1.6038 ± 0.0995    1.4773 ± 0.2490
−0.8334      −0.8393 ± 0.0579   −0.8223 ± 0.0562   −0.8246 ± 0.0663   −0.7647 ± 0.1347
The results indicate that it is possible to identify the true model parameters, from knowledge of the power spectrum and a 1-D slice of the bispectrum, using the proposed LMS-II algorithm. However, the results obtained via the LMS-II algorithm are not as good, in terms of mean and standard deviation, as those obtained via the LMS-I algorithm. That is to say, the LMS-I algorithm outperforms the LMS-II algorithm. This is, however, expected, as the amount of relevant statistics used in each algorithm is different. For instance, the LMS-I algorithm utilizes all slices of the bispectrum, whereas the LMS-II algorithm utilizes only the diagonal slice. It is relevant to mention that the improvement in performance of the LMS-I algorithm over the LMS-II algorithm is achieved at the expense of increased computational complexity. In the LMS-I algorithm, we need to compute (4p² + 4p + 1) cumulant samples and to solve for (q² + 5q + 2)/2 unknown parameters. In the LMS-II algorithm, we need to compute 4p + 1 cumulant samples of the 1-D statistic c̃(τ) and to solve for 3q + 1 unknown parameters.

Tables 4–6 show the results obtained for the third model when the GA of Section 4 is implemented at different values of SNR and data length N = 2,048. In this simulation, we have considered the minimization of the cost functions J and Jd given by (36) and (37), denoted by GEN-I and GEN-II, respectively. For comparison purposes, Chi's method [13] has also been tested using the same data. When we compare the results displayed in Tables 2 and 3 with those displayed in Tables 4 and 5, the advantage of using the nonlinear methods is evident. That is, the GA-based estimation methods, which include Chi's method as a special case, produce better results (in terms of mean and standard deviation) than their linear counterparts. Furthermore, we note that modelling using both second- and third-order cumulants has a performance advantage over modelling using the third-order cumulant only at relatively high SNR.
6. Conclusion In this work we have demonstrated the use of cumulants and their associated Fourier transforms (polyspectra) in 9
Table 5
Estimated AR Coefficients of Simulation Example Using GEN-II at N = 2,048 (µ = Mean, σ = Standard Deviation)

True Value    SNR = ∞             SNR = 100 dB        SNR = 20 dB         SNR = 10 dB
              µ         σ         µ         σ         µ         σ         µ         σ
−1.7668       −1.4388   0.1621    −1.2816   0.2158    −1.1016   0.3153    −1.2001   0.4190
 2.5336        2.1271   0.1268     2.3905   0.3135     2.2334   0.3882     2.1329   0.3113
−3.0502       −2.7615   0.2907    −2.5754   0.1500    −2.6868   0.3693    −2.7995   0.4782
 1.6668        1.4937   0.2551     1.6107   0.3437     1.5496   0.4220     1.4190   0.4617
−0.8334       −0.7330   0.1632    −0.7407   0.1479    −0.4923   0.3990    −0.4410   0.4707
Table 6
Estimated AR Coefficients of Simulation Example Using Chi's Method at N = 2,048 (µ = Mean, σ = Standard Deviation)

True Value    SNR = ∞             SNR = 100 dB        SNR = 20 dB         SNR = 10 dB
              µ         σ         µ         σ         µ         σ         µ         σ
−1.7668       −1.7413   0.0681    −1.7403   0.0978    −1.7450   0.0677    −1.7098   0.0753
 2.5336        2.5213   0.1259     2.5132   0.1633     2.5040   0.0977     2.4649   0.1115
−3.0502       −3.1100   0.1842    −3.1103   0.2259    −3.0957   0.1525    −3.0278   0.1684
 1.6668        1.6563   0.1275     1.6295   0.1628     1.6543   0.1286     1.5731   0.1265
−0.8334       −0.8007   0.0953    −0.8005   0.1047    −0.7990   0.1003    −0.7496   0.1012
the modelling of non-Gaussian processes generated by AR models. Specifically, we have developed a new formulation that relates the unknown AR parameters to second- and higher-order cumulants. The new formulation facilitates the use of linear and nonlinear LS estimation techniques, and includes the work of [13] as a special case. Several conclusions can be drawn from the identification methods developed in the previous sections:
• A causal or noncausal AR model whose input is an i.i.d. non-Gaussian process can be reconstructed from output data using second- and third-order cumulants. Computer simulations have demonstrated that identification methods based on both second- and third-order cumulants outperform those based on third-order cumulants only at relatively high SNR.
• A causal or noncausal AR model can be identified from output data using only the second-order statistics and a 1-D slice of the bispectrum. Although the use of a 1-D slice of the bispectrum is generally inferior (in terms of the accuracy of the estimates) to the use of all slices, it leads to computationally efficient identification methods.
• GA is a promising optimization technique. This article has presented its application in the development of a new method for estimating the parameters of a causal or noncausal AR model from only its output measurements. The parameters are identified evolutionarily by cumulants-matching performed using a GA designed to minimize a cost function defined in terms of the model's output cumulants. Results demonstrate that the new method accurately identifies the parameters of a model of predetermined order, and that it remains capable of identifying the AR parameters when the output data are corrupted by Gaussian noise.
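The robustness to additive Gaussian noise rests on a standard fact: all cumulants of a Gaussian process above second order vanish, so third-order statistics of the noisy output converge to those of the clean signal. A minimal numerical sketch of this (the estimator and test signals are illustrative, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)

def c3(x, tau1, tau2):
    """Biased sample estimate of the third-order cumulant
    c3(tau1, tau2) = E[x(n) x(n + tau1) x(n + tau2)] for zero-mean x."""
    n = len(x)
    lo = max(0, -tau1, -tau2)
    hi = min(n, n - tau1, n - tau2)
    k = np.arange(lo, hi)
    return float(np.mean(x[k] * x[k + tau1] * x[k + tau2]))

N = 200_000
skewed = rng.exponential(1.0, N) - 1.0  # non-Gaussian: third cumulant is 2
gauss = rng.normal(0.0, 1.0, N)         # Gaussian: third cumulant is 0

print(c3(skewed, 0, 0))  # close to 2
print(c3(gauss, 0, 0))   # close to 0
print(c3(gauss, 1, 2))   # also close to 0
```

Because the Gaussian estimates vanish (up to estimation error) at every lag pair, adding such noise to the skewed signal leaves its sample third-order cumulants asymptotically unchanged.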
Acknowledgement
This work was supported by the Research Center, College of Engineering, King Saud University, under grant no. 421/5.
References
[1] S. Kay, Modern spectral estimation: Theory and application (Englewood Cliffs, NJ: Prentice-Hall, 1987).
[2] J.W. Woods, Two-dimensional Kalman filtering, in T.S. Huang (Ed.), Two-dimensional digital signal processing I (New York: Springer-Verlag, 1981).
[3] J.D. Scargle, Studies in astronomical time series analysis – I: Modeling random processes in the time domain, Astrophysical Journal Supplement Series, 45, 1981, 1–71.
[4] A.C. Hsueh & J.M. Mendel, Minimum-variance and maximum-likelihood deconvolution for noncausal channel models, IEEE Transactions on Geoscience and Remote Sensing, 23 (11), 1985, 797–808.
[5] D. Donoho, On minimum entropy deconvolution, in D.F. Findley (Ed.), Applied time series analysis – II (New York: Academic, 1981).
[6] A. Benveniste, M. Goursat, & G. Ruget, Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communications, IEEE Transactions on Automatic Control, 25 (3), 1980, 385–398.
[7] C.L. Nikias & A.P. Petropulu, Higher-order spectra analysis (New Jersey: Prentice Hall, 1993).
[8] C.L. Nikias & M. Raghuveer, Bispectrum estimation: A digital signal processing framework, Proceedings of the IEEE, 75 (7), 1987, 869–891.
[9] J.M. Mendel, Tutorial on higher-order statistics (spectra) in signal processing and system theory: Theoretical results and some applications, Proceedings of the IEEE, 79 (3), 1991, 277–305.
[10] A. Swami, G.B. Giannakis, & G. Zhou, Bibliography on higher-order statistics, Signal Processing, 60 (1), 1997, 65–126.
[11] J.K. Tugnait, Fitting noncausal AR signal plus noise models to noisy non-Gaussian linear processes, IEEE Transactions on Automatic Control, 32 (6), 1987, 547–552.
[12] L. Chen, H. Kusaka, M. Kominami, & Q. Yin, Blind identification of noncausal AR models based on higher-order statistics, Signal Processing, 48 (1), 1996, 27–36.
[13] C.Y. Chi, J.L. Hwang, & C.F. Rau, A new cumulant-based parameter method for noncausal autoregressive systems, IEEE Transactions on Signal Processing, 42 (9), 1994, 2524–2527.
[14] G.H. Golub & C.F. Van Loan, Matrix computations (Baltimore: Johns Hopkins University Press, 1983).
[15] S.A. Alshebeili & A.E. Cetin, A phase reconstruction algorithm from bispectrum, IEEE Transactions on Geoscience and Remote Sensing, 28 (2), 1990, 166–170.
[16] S.A. Alshebeili, A.E. Cetin, & A.N. Venetsanopoulos, An adaptive system identification method based on bispectrum, IEEE Transactions on Circuits and Systems, 38 (8), 1991, 967–969.
[17] S.A. Alshebeili, A.E. Cetin, & A.N. Venetsanopoulos, Identification of nonminimum phase MA systems using cepstral operations on slices of higher order spectra, IEEE Transactions on Circuits and Systems, 39 (9), 1992, 634–637.
[18] S.A. Alshebeili, A.N. Venetsanopoulos, & A.E. Cetin, Cumulant-based identification approaches for nonminimum phase FIR systems, IEEE Transactions on Signal Processing, 41 (4), 1993, 1576–1588.
[19] P.E. Gill, W. Murray, & M.H. Wright, Practical optimization (London: Academic Press, 1981).
[20] K.S. Tang, K.F. Man, S. Kwong, & Q. He, Genetic algorithms and their applications, IEEE Signal Processing Magazine, 13 (6), 1996, 22–36.
[21] J.H. Holland, Adaptation in natural and artificial systems (Ann Arbor: University of Michigan Press, 1975).
[22] S. Kirkpatrick, C.D. Gelatt, & M.P. Vecchi, Optimization by simulated annealing, Science, 220 (4598), 1983, 671–680.
[23] H.P. Schwefel, Numerical optimization of computer models (Chichester: John Wiley, 1981).
[24] D.B. Fogel, System identification through simulated evolution: A machine learning approach to modeling (Needham Heights, MA: Ginn Press, 1991).
[25] M.A. Alkanhal & S.A. Alshebeili, Blind identification of nonminimum phase FIR systems: Cumulants matching via genetic algorithms, Signal Processing, 67, 1998, 25–34.
[26] K. Twardowski, An associative architecture for genetic algorithm-based machine learning, IEEE Computer, 27 (11), 1994, 27–38.
[27] A.J. Chipperfield & P.J. Fleming, Parallel genetic algorithms: A survey, ACSE Research Report 518, University of Sheffield, May 1994.
[28] H.M. Voigt, I. Santibanez-Koref, & J. Born, Hierarchically structured distributed genetic algorithms, Proc. of the Parallel Problem Solving from Nature Conf., PPSN-II, Brussels, Belgium, September 1992, 157–166.
[29] M. Tomassini, Parallel and distributed evolutionary algorithms: A review, in K. Miettinen, M. Makela, P. Neittaanmaki, & J. Periaux (Eds.), Evolutionary algorithms in engineering and computer science (Chichester: John Wiley & Sons, 1999), 113–133.
[30] G.B. Giannakis & J.M. Mendel, Cumulant-based order determination of non-Gaussian ARMA models, IEEE Transactions on Acoustics, Speech, and Signal Processing, 38 (8), 1990, 1411–1423.
Biographies

Saleh A. Alshebeili received the B.Sc. and M.Sc. degrees from King Saud University, Riyadh, Saudi Arabia, in 1984 and 1986, respectively, and the Ph.D. degree from the University of Toronto, Ontario, Canada, in 1992, all in Electrical Engineering. He is presently an Associate Professor and Chairman of the Department of Electrical Engineering at King Saud University. His research interests are in the general area of statistical signal processing with applications to communications, speech, and image processing.

Mohammad A. Alsehaili received the B.Sc. and M.Sc. degrees in Electrical Engineering from King Saud University, Riyadh, Saudi Arabia, in 1992 and 1998, respectively. He is currently pursuing the Ph.D. degree in Electrical Engineering at the University of Manitoba, Winnipeg, Manitoba, Canada. His research interests are in the area of electromagnetic waves and antennas.

Majeed A. Alkanhal received the B.Sc. and M.Sc. degrees from King Saud University, Riyadh, Saudi Arabia, in 1984 and 1986, respectively, and the Ph.D. degree from Syracuse University, Syracuse, New York, in 1994, all in Electrical Engineering. He is presently an Assistant Professor in the Department of Electrical Engineering at King Saud University. His research interests are in antennas and propagation, computational electromagnetics, and modern optimization techniques as applied to electrical engineering and communication systems. He has authored and co-authored several research papers, reports, and a book in engineering electromagnetics. He has also worked as a consultant in the field of electronics and communications and has written and participated in approving several Saudi standards.