Adaptive Signal Processing 2011 - Lecture 1
Welcome to Adaptive Signal Processing!
From Merriam-Webster's Collegiate Dictionary: ad·ap·ta·tion, noun, 1610. 1: the act or process of adapting : the state of being adapted. 2: adjustment to environmental conditions: as a: adjustment of a sense organ to the intensity or quality of stimulation; b: modification of an organism or its parts that makes it more fit for existence under the conditions of its environment. 3: something that is adapted; specifically: a composition rewritten into a new form. Derived forms: ad·ap·ta·tion·al (adjective), ad·ap·ta·tion·al·ly (adverb).
Course literature

Book: Simon Haykin, Adaptive Filter Theory, 4th edition, Prentice-Hall, 2001. ISBN: 0-13-090126-1 (hardcover). Chapters: Background, (2), 4, 5, 6, 7, 8, 9, 13.2, 14.1 (3rd edition: Introduction, 1, (5), 8, 9, 10, 13, 16.1, 17.2).

Exercise material: Exercise compendium (course home page)
Lectures and exercises

Lectures: Tuesdays 08.15-10.00 in room E:1406
Exercises: Wednesdays 08.15-10.00 in room E:1145
Computer exercises: Wednesdays 13.15-15.00 or Thursdays 10.15-12.00 in room E:4115
Labs:
Lab I: Adaptive channel equalizer, in room E:4115
Lab II: Adaptive filter on a DSP, in room E:4115
Sign up on the lists on the course web page from Monday, Nov 1.
Contents - References in the 4th edition
Week 1: Repetition of OSB (Hayes, or chap. 2), The method of steepest descent (chap. 4)
Week 2: The LMS algorithm (chap. 5)
Week 3: Modified LMS algorithms (chap. 6)
Week 4: Frequency adaptive filters (chap. 7)
Week 5: The RLS algorithm (chap. 8-9)
Week 6: Tracking and implementation aspects (chap. 13.2, 14.1)
Week 7: Summary

Other material:
Lecture notes (course home page)
Matlab code (course home page)
Computer exercises (course home page)
Labs (course home page)
Contents - References in the 3rd edition

Week 1: Repetition of OSB (Hayes, or chap. 5), The method of steepest descent (chap. 8)
Week 2: The LMS algorithm (chap. 9)
Week 3: Modified LMS algorithms (chap. 9)
Week 4: Frequency adaptive filters (chap. 10, 1)
Week 5: The RLS algorithm (chap. 11)
Week 6: Tracking and implementation aspects (chap. 16.1, 17.2)
Week 7: Summary

This lecture deals with

• Repetition of the course Optimal signal processing (OSB)
• The method of steepest descent
Recap of Optimal signal processing (OSB)
The following problems were treated in OSB:

• Signal modeling: either a model with both poles and zeros, or a model with only poles (vocal tract) or only zeros (lips).
• Inverse filter of FIR type: deconvolution or equalization of a channel.
• Wiener filter: filtering, equalization, prediction and deconvolution.

Optimal Linear Filtering
[Block diagram: the input signal u(n) passes through the filter w, producing the output signal y(n); y(n) is subtracted from the desired signal d(n) to form the estimation error e(n) = d(n) − y(n).]
We seek the filter w = [w0 w1 w2 ...]^T that minimizes the estimation error e(n), so that the output signal y(n) resembles the desired signal d(n) as closely as possible.
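As a small illustration of this notation (not part of the course material, which provides its own Matlab code on the home page), the NumPy sketch below forms the tap-input vector u(n), the output y(n) = w^H u(n) and the error e(n) = d(n) − y(n); the filter length M, the coefficients and the signals are arbitrary placeholders.

    import numpy as np

    rng = np.random.default_rng(0)

    M = 3                                  # filter length (placeholder)
    w = np.array([0.5, -0.2, 0.1])         # filter w = [w0 w1 w2]^T (placeholder values)

    N = 100
    u = rng.standard_normal(N)             # input signal u(n) (placeholder)
    d = rng.standard_normal(N)             # desired signal d(n) (placeholder)

    y = np.zeros(N)
    e = np.zeros(N)
    for n in range(M - 1, N):
        u_vec = u[n - M + 1:n + 1][::-1]   # tap-input vector [u(n) u(n-1) ... u(n-M+1)]^T
        y[n] = w.conj() @ u_vec            # output y(n) = w^H u(n)
        e[n] = d[n] - y[n]                 # estimation error e(n) = d(n) - y(n)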
In order to determine the optimal filter, a cost function J that penalizes the deviation e(n) is introduced. The larger e(n), the higher the cost. From OSB you know several different strategies, e.g.,

• The total squared error (LS), a deterministic description of the signal:
  J = Σ_{n=n1}^{n2} e²(n)
• The mean squared error (MS), a stochastic description of the signal:
  J = E{|e(n)|²}
• The mean squared error with an extra constraint:
  J = E{|e(n)|²} + λ|u(n)|²

The cost function J(n) = E{|e(n)|^p} can be used for any p ≥ 1, but most often p = 2 is chosen. This choice gives a convex cost function, referred to as the mean squared error:

J = E{e(n)e*(n)} = E{|e(n)|²}    (MSE)
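A minimal sketch of how these costs could be evaluated from an error signal; the limits n1, n2 and the error sequence are placeholders, and the MSE expectation is approximated by a time average under an assumed ergodicity.

    import numpy as np

    rng = np.random.default_rng(1)
    e = rng.standard_normal(100)            # placeholder error signal e(n)

    # Total squared error (LS) over a deterministic window n1..n2 (placeholder limits)
    n1, n2 = 10, 90
    J_ls = np.sum(np.abs(e[n1:n2 + 1]) ** 2)

    # MSE, J = E{|e(n)|^2}: the expectation is approximated by a time average
    J_mse = np.mean(np.abs(e) ** 2)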
In order to find the optimal filter coefficients, J is minimized with respect to the coefficients. This is done by differentiating J with respect to w0, w1, ..., and then setting the derivatives to zero. Here it is important that the cost function is convex, i.e., that there is a global minimum. The minimization is expressed in terms of the gradient operator ∇,

∇J = 0

where ∇J is called the gradient vector. In particular, the choice of the squared cost function (the mean squared error) leads to the Wiener-Hopf system of equations.

In matrix form, the cost function J = E{|e(n)|²} can be written

J(w) = E{[d(n) − w^H u(n)][d(n) − w^H u(n)]*} = σd² − w^H p − p^H w + w^H R w

where

w = [w0 w1 ... w_{M−1}]^T    (M × 1)
u(n) = [u(n) u(n−1) ... u(n−M+1)]^T    (M × 1)

R = E{u(n) u^H(n)} =
    [ r(0)       r(1)       ...  r(M−1)
      r*(1)      r(0)       ...  r(M−2)
      ...        ...        ...  ...
      r*(M−1)    r*(M−2)    ...  r(0)   ]

p = E{u(n) d*(n)} = [p(0) p(−1) ... p(−(M−1))]^T    (M × 1)

σd² = E{d(n) d*(n)}
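As a rough sketch under an ergodicity assumption (my own illustration, not the course's code), the expectations above can be approximated by time averages to estimate R, p and σd² from data; the signals and the order M are placeholders.

    import numpy as np

    rng = np.random.default_rng(2)
    N, M = 1000, 3                          # placeholder data length and filter order

    u = rng.standard_normal(N)              # input signal u(n) (placeholder)
    d = rng.standard_normal(N)              # desired signal d(n) (placeholder)

    R_hat = np.zeros((M, M))
    p_hat = np.zeros(M)
    for n in range(M - 1, N):
        u_vec = u[n - M + 1:n + 1][::-1]         # u(n) = [u(n) ... u(n-M+1)]^T
        R_hat += np.outer(u_vec, u_vec.conj())   # accumulate u(n) u^H(n)
        p_hat += u_vec * np.conj(d[n])           # accumulate u(n) d*(n)

    K = N - M + 1
    R_hat /= K                              # R ≈ E{u(n) u^H(n)}
    p_hat /= K                              # p ≈ E{u(n) d*(n)}
    sigma_d2 = np.mean(np.abs(d) ** 2)      # σd² ≈ E{|d(n)|²}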
The gradient operator yields

∇J(w) = 2 ∂J(w)/∂w* = 2 ∂/∂w* (σd² − w^H p − p^H w + w^H R w) = −2p + 2Rw

If the gradient vector is set to zero, the Wiener-Hopf system of equations results:

R wo = p    (Wiener-Hopf)

whose solution is the Wiener filter,

wo = R^{−1} p    (Wiener filter)

In other words, the Wiener filter is optimal when the cost is measured by the MSE.

The cost function's dependence on the filter coefficients w becomes clear if it is written in canonical form:

J(w) = σd² − w^H p − p^H w + w^H R w
     = σd² − p^H R^{−1} p + (w − wo)^H R (w − wo)

Here the Wiener-Hopf equations and the expression for the Wiener filter have been used, together with the fact that the following decomposition can be made:

w^H R w = (w − wo)^H R (w − wo) − wo^H R wo + wo^H R w + w^H R wo

With the optimal filter w = wo, the minimal error Jmin is achieved:

Jmin ≡ J(wo) = σd² − p^H R^{−1} p    (MMSE)
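Using the numerical example that appears on a later slide (the R, p and σd² values below), a short sketch of solving the Wiener-Hopf equations and evaluating Jmin; solving the linear system instead of forming R^{−1} explicitly is my own design choice, not something prescribed by the course.

    import numpy as np

    R = np.array([[1.1, 0.5],
                  [0.5, 1.1]])              # correlation matrix (example from the slides)
    p = np.array([0.5272, -0.4458])         # cross-correlation vector (example)
    sigma_d2 = 0.9486                       # desired-signal power (example)

    # Wiener-Hopf: R w_o = p  =>  w_o = R^{-1} p (solved as a linear system)
    w_o = np.linalg.solve(R, p)
    J_min = sigma_d2 - p.conj() @ w_o       # J_min = σd² − p^H R^{-1} p

    print(w_o)     # close to the slide values w_o ≈ [0.836, -0.785]
    print(J_min)   # close to the slide value J_min ≈ 0.158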
Error-performance surface for an FIR filter with two coefficients, w = [w0, w1]^T:

[3D plot of the cost function J(w) over the (w0, w1) plane, with its minimum at wo.]

Example (two-tap filter):
R = [1.1 0.5; 0.5 1.1]
p = [0.5272 −0.4458]^T
σd² = 0.9486
wo = [0.8360 −0.7853]^T
Jmin = 0.1579

Steepest Descent

The method of steepest descent is a recursive method for finding the Wiener filter when the statistics of the signals are known. The method of steepest descent is not an adaptive filter, but it serves as the basis for the LMS algorithm, which is presented in Lecture 2.
The method of steepest descent is a recursive method that leads to the Wiener-Hopf equations. The statistics (R, p) are known. The purpose is to avoid the inversion of R, which saves computations.

• Set start values for the filter coefficients, w(0) (n = 0).
• Determine the gradient ∇J(n), which points in the direction in which the cost function increases the most:
  ∇J(n) = −2p + 2Rw(n)
• Adjust w(n+1) in the opposite direction to the gradient, but weight down the adjustment with the step-size parameter µ:
  w(n+1) = w(n) + ½ µ [−∇J(n)]
• Repeat steps 2 and 3.
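A minimal sketch of this recursion, using the example statistics R and p from the error-performance-surface slide above; the step size µ = 0.1, the start value w(0) = 0 and the number of iterations are arbitrary choices of mine.

    import numpy as np

    R = np.array([[1.1, 0.5],
                  [0.5, 1.1]])           # known statistics (example from the slides)
    p = np.array([0.5272, -0.4458])

    mu = 0.1                             # step-size parameter (arbitrary choice)
    w = np.zeros(2)                      # start values w(0)

    for n in range(200):
        grad = -2 * p + 2 * R @ w        # gradient ∇J(n) = −2p + 2Rw(n)
        w = w + 0.5 * mu * (-grad)       # w(n+1) = w(n) + (1/2)µ[−∇J(n)] = w(n) + µ[p − Rw(n)]

    print(w)                             # approaches the Wiener filter w_o = R^{-1} p
    print(np.linalg.solve(R, p))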
Convergence, filter coefficients

Since the method of steepest descent contains feedback, there is a risk that the algorithm diverges. This limits the choice of the step-size parameter µ. One example of the critical choice of µ is given below (µ = 1.5). The statistics are the same as in the previous example.

[Plot: the filter coefficients w(n) versus the iteration number n (0-100) for different step sizes (µ = 0.1, µ = 1.0, µ = 1.25 appear in the legend), with w(0) = [0 0]^T and R = [1.1 0.5; 0.5 1.1], p = [0.5272 −0.4458]^T, wo = [0.8360 −0.7853]^T; the coefficients approach wo.]

Convergence, error surface

The influence of the step-size parameter on the convergence can be seen when analyzing J(w). The example below illustrates the convergence towards Jmin for different choices of µ.

[Plot: contours of J(w) in the (w0, w1) plane, with the trajectories of w(n) from the start value w(0) = [1 1.7]^T towards wo for µ = 0.1, µ = 0.5 and µ = 1.0; R, p and wo as in the previous example.]
Convergence analysis
How should µ be chosen? A small value gives slow convergence, while a large value entails a risk of divergence. Perform an eigenvalue decomposition of R in the expression for J(w(n)):

J(n) = Jmin + (w(n) − wo)^H R (w(n) − wo)
     = Jmin + (w(n) − wo)^H Q Λ Q^H (w(n) − wo)
     = Jmin + ν^H(n) Λ ν(n) = Jmin + Σ_k λk |νk(n)|²

The convergence of the cost function thus depends on ν(n), i.e., on the convergence of w(n), through the relationship ν(n) = Q^H (w(n) − wo).
With the observation that w(n) = Qν(n) + wo, the update of the filter coefficients can be rewritten in the ν-domain:

w(n+1) = w(n) + µ[p − Rw(n)]
Qν(n+1) + wo = Qν(n) + wo + µ[p − R Q ν(n) − R wo]
ν(n+1) = ν(n) − µ Q^H R Q ν(n) = (I − µΛ) ν(n)
νk(n+1) = (1 − µλk) νk(n)

Convergence, time constants

The time constant indicates how many iterations it takes until the respective error has decreased by the factor e⁻¹, where e denotes the base of the natural logarithm. The smaller the time constant, the better. For eigenmode k (eigenvalue λk, i.e., element k of ν(n)), the recursion above gives |νk(n)| = |1 − µλk|ⁿ |νk(0)|, so the time constant is

τk = −1 / ln(1 − µλk) ≈ 1/(µλk)    (for small µλk)
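To tie the pieces together, a sketch (my own illustration) that computes the eigenvalues of the example R, the mode factors (1 − µλk), the step-size bound 0 < µ < 2/λmax that follows from requiring |1 − µλk| < 1 for every mode, and the time constants τk; µ is an arbitrary choice.

    import numpy as np

    R = np.array([[1.1, 0.5],
                  [0.5, 1.1]])               # example correlation matrix from the slides
    lam, Q = np.linalg.eigh(R)               # eigenvalues λk and eigenvectors (columns of Q)

    mu = 0.1                                 # step-size parameter (arbitrary choice)
    print("eigenvalues:", lam)               # 0.6 and 1.6 for this R
    print("stable for 0 < mu <", 2 / lam.max())   # from |1 − µλk| < 1 for all k

    mode_factors = 1 - mu * lam              # νk(n+1) = (1 − µλk) νk(n)
    tau = -1 / np.log(np.abs(1 - mu * lam))  # iterations for mode k to decay by e^{-1}
    print("mode factors:", mode_factors)
    print("time constants:", tau, "approx:", 1 / (mu * lam))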