Welcome to Adaptive Signal Processing!

Adaptive Signal Processing 2011, Lecture 1

From Merriam-Webster's Collegiate Dictionary:

Main Entry: ad·ap·ta·tion
Function: noun
Date: 1610
1 : the act or process of adapting : the state of being adapted
2 : adjustment to environmental conditions: as a : adjustment of a sense organ to the intensity or quality of stimulation b : modification of an organism or its parts that makes it more fit for existence under the conditions of its environment
3 : something that is adapted; specifically : a composition rewritten into a new form

Course literature

Book: Simon Haykin, Adaptive Filter Theory, 4th edition, Prentice-Hall, 2001. ISBN 0-13-090126-1 (hardcover).
Chapters: Backgr., (2), 4, 5, 6, 7, 8, 9, 13.2, 14.1 (3rd edition: Intr., 1, (5), 8, 9, 10, 13, 16.1, 17.2).

Exercise material: Exercise compendium (course home page)
Computer exercises (course home page)
Lab exercises (course home page)
Other material: Lecture notes (course home page), Matlab code (course home page)

Lectures and exercises

Lectures: Tuesdays 08.15-10.00 in room E:1406
Exercises: Wednesdays 08.15-10.00 in room E:1145
Computer exercises: Wednesdays 13.15-15.00 or Thursdays 10.15-12.00, in room E:4115
Lab sessions:
Lab I: Adaptive channel equalizer, in room E:4115
Lab II: Adaptive filter on a DSP, in room E:4115

Sign up on the lists on the course webpage from Monday, Nov 1.

Contents - References in the 4th edition

Week 1: Recap of OSB (Hayes, or chap. 2); the method of Steepest Descent (chap. 4)
Week 2: The LMS algorithm (chap. 5)
Week 3: Modified LMS algorithms (chap. 6)
Week 4: Frequency-domain adaptive filters (chap. 7)
Week 5: The RLS algorithm (chap. 8-9)
Week 6: Tracking and implementation aspects (chap. 13.2, 14.1)
Week 7: Summary

Contents - References in the 3rd edition

Week 1: Recap of OSB (Hayes, or chap. 5); the method of Steepest Descent (chap. 8)
Week 2: The LMS algorithm (chap. 9)
Week 3: Modified LMS algorithms (chap. 9)
Week 4: Frequency-domain adaptive filters (chap. 10, 1)
Week 5: The RLS algorithm (chap. 11)
Week 6: Tracking and implementation aspects (chap. 16.1, 17.2)
Week 7: Summary

Lecture 1

This lecture deals with

• Recap of the course Optimal signal processing (OSB)
• The method of Steepest Descent

Recap of Optimal signal processing (OSB)

The following problems were treated in OSB:

• Signal modeling: either a model with both poles and zeros, or a model with only poles (vocal tract) or only zeros (lips).
• Inverse filtering of FIR type: deconvolution or equalization of a channel.
• Wiener filtering: filtering, equalization, prediction and deconvolution.

Optimal Linear Filtering

[Block diagram: the input signal u(n) is filtered by w to give the output signal y(n), which is subtracted from the desired signal d(n) to form the estimation error e(n) = d(n) − y(n).]

We seek the filter w = [w0 w1 w2 ...]^T that minimizes the estimation error e(n), so that the output signal y(n) resembles the desired signal d(n) as closely as possible.

Optimal Linear Filtering

In order to determine the optimal filter, a cost function J that penalizes the deviation e(n) is introduced: the larger e(n), the higher the cost. From OSB you know several strategies, e.g.,

• The total squared error (LS). Deterministic description of the signal:
J = Σ_{n=n1}^{n2} e²(n)
• The mean squared error (MS). Stochastic description of the signal:
J = E{|e(n)|²}
• The mean squared error with an extra constraint:
J = E{|e(n)|²} + λ|u(n)|²

Optimal Linear Filtering

The cost function J(n) = E{|e(n)|^p} can be used for any p ≥ 1, but most often p = 2 is chosen. This choice gives a convex cost function, referred to as the mean squared error (MSE):

J = E{e(n)e*(n)} = E{|e(n)|²}    (MSE)
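To make these quantities concrete, here is a minimal Python/NumPy sketch (the course itself provides Matlab code; this is only an illustration, and all signal and filter values below are made up) that filters a synthetic input u(n) with a candidate filter w, forms the error e(n) = d(n) − y(n), and evaluates both the deterministic LS sum and a sample estimate of the MSE.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000                                  # number of samples
w_true = np.array([0.8, -0.4, 0.2])       # hypothetical system generating d(n)
u = rng.standard_normal(N)                # input signal u(n)
d = np.convolve(u, w_true)[:N] + 0.1 * rng.standard_normal(N)   # desired signal d(n)

w = np.array([0.5, -0.3, 0.1])            # some candidate FIR filter w
y = np.convolve(u, w)[:N]                 # output signal y(n)
e = d - y                                 # estimation error e(n) = d(n) - y(n)

J_ls = np.sum(e**2)                       # total squared error (LS), a deterministic sum over n
J_mse = np.mean(np.abs(e)**2)             # sample average approximating the MSE E{|e(n)|^2}
print(J_ls, J_mse)
```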

Optimal Linear Filtering

In order to find the optimal filter coefficients, J is minimized with respect to them. This is done by differentiating J with respect to w0, w1, ..., and then setting the derivative to zero. Here it is important that the cost function is convex, i.e., that there is a global minimum. The minimization is expressed in terms of the gradient operator ∇,

∇J = 0

where ∇J is called the gradient vector. In particular, the choice of the squared cost function (the mean squared error) leads to the Wiener-Hopf system of equations.

Optimal Linear Filtering

In matrix form, the cost function J = E{|e(n)|²} can be written

J(w) = E{[d(n) − w^H u(n)][d(n) − w^H u(n)]*} = σd² − w^H p − p^H w + w^H R w

where

w = [w0 w1 ... w_{M−1}]^T    (M × 1)
u(n) = [u(n) u(n−1) ... u(n−M+1)]^T    (M × 1)
R = E{u(n) u^H(n)} =
[ r(0)       r(1)       ...   r(M−1) ]
[ r*(1)      r(0)       ...   r(M−2) ]
[  ...        ...       ...    ...   ]
[ r*(M−1)    r*(M−2)    ...   r(0)   ]
p = E{u(n) d*(n)} = [p(0) p(−1) ... p(−(M−1))]^T    (M × 1)
σd² = E{d(n) d*(n)}
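As a complement, here is a Python/NumPy sketch (real-valued signals assumed, all names and data hypothetical) of how R, p and σd² can be estimated by sample averages and plugged into the quadratic cost above.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 5000, 2                            # number of samples, filter length M
u = rng.standard_normal(N)                # input signal u(n)
d = np.convolve(u, [0.9, -0.6])[:N] + 0.05 * rng.standard_normal(N)   # desired signal d(n)

# Tap-input vectors u(n) = [u(n) u(n-1) ... u(n-M+1)]^T, stacked as rows
U = np.column_stack([u[M-1-k:N-k] for k in range(M)])
dn = d[M-1:]

R = U.T @ U / len(U)                      # sample estimate of R = E{u(n) u^H(n)}
p = U.T @ dn / len(U)                     # sample estimate of p = E{u(n) d*(n)}
sigma_d2 = np.mean(dn**2)                 # sample estimate of sigma_d^2 = E{|d(n)|^2}

def J(w):
    # Quadratic cost in matrix form (real-valued case, so w^H p = p^H w = w @ p)
    return sigma_d2 - 2 * w @ p + w @ R @ w

print(J(np.zeros(M)))                     # cost of the all-zero filter (= sigma_d2)
print(J(np.linalg.solve(R, p)))           # cost of the estimated Wiener filter
```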

Optimal Linear Filtering

The gradient operator yields

∇J(w) = 2 ∂J(w)/∂w* = 2 ∂/∂w* (σd² − w^H p − p^H w + w^H R w) = −2p + 2Rw

If the gradient vector is set to zero, the Wiener-Hopf system of equations results,

R wo = p    (Wiener-Hopf)

whose solution is the Wiener filter,

wo = R^-1 p    (Wiener filter)

In other words, the Wiener filter is optimal when the cost is measured by the MSE.

Optimal Linear Filtering

The cost function's dependence on the filter coefficients w becomes clear if it is written in canonical form,

J(w) = σd² − w^H p − p^H w + w^H R w
     = σd² − p^H R^-1 p + (w − wo)^H R (w − wo)

Here the Wiener-Hopf equations and the expression for the Wiener filter have been used, together with the decomposition

w^H R w = (w − wo)^H R (w − wo) − wo^H R wo + wo^H R w + w^H R wo

With the optimal filter w = wo the minimum error Jmin is attained:

Jmin ≡ J(wo) = σd² − p^H R^-1 p    (MMSE)
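The following short Python/NumPy check (a sketch, using the numerical values that appear in the error-surface example on the next slide) solves the Wiener-Hopf equations and evaluates Jmin:

```python
import numpy as np

R = np.array([[1.1, 0.5],
              [0.5, 1.1]])
p = np.array([0.5272, -0.4458])
sigma_d2 = 0.9486

w_o = np.linalg.solve(R, p)          # solve R w_o = p rather than forming R^-1 explicitly
J_min = sigma_d2 - p @ w_o           # J_min = sigma_d^2 - p^H R^-1 p

print(w_o)    # approximately [ 0.836 -0.785], matching w_o on the slide
print(J_min)  # approximately 0.158, matching J_min on the slide
```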

Optimal Linear Filtering

Error-performance surface for an FIR filter with two coefficients, w = [w0, w1]^T.

[Figure: the quadratic surface J(w) plotted over (w0, w1) together with its contour lines; the minimum Jmin is attained at w = wo. Example values:
p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1], σd² = 0.9486,
wo = [0.8360 −0.7853]^T, Jmin = 0.1579.]

Steepest Descent

The method of Steepest Descent is a recursive method for finding the Wiener filter when the statistics of the signals are known.

The method of Steepest Descent is not an adaptive filter, but it serves as the basis for the LMS algorithm, which is presented in Lecture 2.
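For reference, here is a sketch of how the error surface above can be evaluated numerically via the canonical form J(w) = Jmin + (w − wo)^H R (w − wo), with the values from the slide (the grid range is an assumption chosen to mimic the plotted axes):

```python
import numpy as np

R = np.array([[1.1, 0.5], [0.5, 1.1]])
w_o = np.array([0.8360, -0.7853])
J_min = 0.1579

# Grid of candidate filters (w0, w1) over roughly the same range as the figure
w0, w1 = np.meshgrid(np.linspace(-4, 4, 81), np.linspace(-4, 4, 81))
dW = np.stack([w0 - w_o[0], w1 - w_o[1]], axis=-1)        # deviations w - w_o at every grid point
J = J_min + np.einsum('...i,ij,...j->...', dW, R, dW)     # quadratic form, a paraboloid in (w0, w1)

# The minimum of the surface sits (up to grid resolution) at w = w_o with value J_min
i, j = np.unravel_index(np.argmin(J), J.shape)
print(J[i, j], w0[i, j], w1[i, j])
```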

Steepest Descent

The method of Steepest Descent is a recursive method that leads to the Wiener-Hopf equations. The statistics (R, p) are assumed known. The purpose is to avoid inverting R, which saves computations.

1. Set start values for the filter coefficients, w(0) (n = 0).
2. Determine the gradient ∇J(n), which points in the direction in which the cost function increases the most:
∇J(n) = −2p + 2Rw(n)
3. Adjust w(n+1) in the direction opposite to the gradient, but scale the adjustment with the step-size parameter µ:
w(n+1) = w(n) + (1/2) µ [−∇J(n)]
4. Repeat steps 2 and 3.

A Python/NumPy sketch of this recursion is given after the convergence examples below.

Convergence, filter coefficients

Since the method of Steepest Descent contains feedback, there is a risk that the algorithm diverges. This limits the choice of the step-size parameter µ. One example of a critical choice of µ (µ = 1.5) is given below. The statistics are the same as in the previous example.

[Figure: the filter coefficients w(n) versus iteration n (0-100) for different step sizes µ, starting from w(0) = [0 0]^T, with p = [0.5272 −0.4458]^T, R = [1.1 0.5; 0.5 1.1] and wo = [0.8360 −0.7853]^T.]

Convergence, error surface

The influence of the step-size parameter on the convergence can be seen by analyzing J(w). The example below illustrates the convergence towards Jmin for different choices of µ.

[Figure: contour plot of the error surface in the (w0, w1) plane with the trajectories of w(n) from w(0) = [1 1.7]^T towards wo for µ = 0.1, 0.5 and 1.0; p, R and wo as above.]
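Here is the promised Python/NumPy sketch of the steepest-descent recursion, w(n+1) = w(n) + (1/2)µ[−∇J(n)] = w(n) + µ[p − Rw(n)], using the statistics from the examples above (the list of step sizes is chosen to mirror the figures, and the helper name steepest_descent is made up):

```python
import numpy as np

R = np.array([[1.1, 0.5], [0.5, 1.1]])
p = np.array([0.5272, -0.4458])
w_o = np.linalg.solve(R, p)              # Wiener solution, for comparison

def steepest_descent(mu, w_init=np.zeros(2), n_iter=100):
    """Iterate w(n+1) = w(n) + mu (p - R w(n)) and return the trajectory."""
    w = np.array(w_init, dtype=float)
    traj = [w.copy()]
    for _ in range(n_iter):
        w = w + mu * (p - R @ w)         # step against the gradient direction
        traj.append(w.copy())
    return np.array(traj)

for mu in (0.1, 0.5, 1.0, 1.25, 1.5):
    w_final = steepest_descent(mu)[-1]
    print(mu, w_final, np.linalg.norm(w_final - w_o))
# Small mu converges slowly, mu around 1 converges quickly; mu = 1.25 equals
# 2/lambda_max for this R (the stability limit), and mu = 1.5 is beyond it,
# so the recursion diverges.
```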

Convergence analysis

How should µ be chosen? A small value gives slow convergence, while a large value carries a risk of divergence. Perform an eigenvalue decomposition of R in the expression for J(w(n)):

J(n) = Jmin + (w(n) − wo)^H R (w(n) − wo)
     = Jmin + (w(n) − wo)^H Q Λ Q^H (w(n) − wo)
     = Jmin + ν^H(n) Λ ν(n) = Jmin + Σ_k λk |νk(n)|²

The convergence of the cost function thus depends on ν(n), i.e., on the convergence of w(n), through the relationship ν(n) = Q^H (w(n) − wo).
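A small sketch (values as before, real-valued case) that checks this decomposition numerically: along a steepest-descent trajectory, the excess cost J(n) − Jmin equals Σ_k λk|νk(n)|², where each mode νk(n) evolves independently according to the recursion derived on the next slide.

```python
import numpy as np

R = np.array([[1.1, 0.5], [0.5, 1.1]])
p = np.array([0.5272, -0.4458])
w_o = np.linalg.solve(R, p)
lam, Q = np.linalg.eigh(R)               # R = Q Lambda Q^H; here lam = [0.6, 1.6]

mu = 0.5
w = np.zeros(2)                          # w(0)
for n in range(5):
    nu = Q.T @ (w - w_o)                 # nu(n) = Q^H (w(n) - w_o)
    excess = np.sum(lam * nu**2)         # J(n) - J_min = sum_k lambda_k |nu_k(n)|^2
    direct = (w - w_o) @ R @ (w - w_o)   # same quantity computed directly from R
    print(n, excess, direct)             # the two columns agree
    w = w + mu * (p - R @ w)             # steepest-descent update
```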

Convergence analysis

With the observation that w(n) = Qν(n) + wo, the update of the cost function can be derived:

w(n+1) = w(n) + µ[p − R w(n)]
Qν(n+1) + wo = Qν(n) + wo + µ[p − R Q ν(n) − R wo]
ν(n+1) = ν(n) − µ Q^H R Q ν(n) = (I − µΛ) ν(n)
νk(n+1) = (1 − µλk) νk(n)    (element k of ν(n))

Convergence, time constants

The time constants indicate how many iterations it takes until the respective error has decreased by the factor e^-1, where e denotes the base of the natural logarithm. The smaller the time constant, the better. Since νk(n) = (1 − µλk)^n νk(0), the time constant τk for eigenmode k (eigenvalue λk) is

τk = −1 / ln(1 − µλk)

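Finally, a sketch computing the per-mode geometric factors and time constants for the example statistics. The stability condition 0 < µ < 2/λmax used below follows from requiring |1 − µλk| < 1 for every mode; it is a consequence of the recursion above, not a value quoted from the slides.

```python
import numpy as np

R = np.array([[1.1, 0.5], [0.5, 1.1]])
lam = np.linalg.eigvalsh(R)               # eigenvalues of R, here 0.6 and 1.6

print("stability bound: 0 < mu <", 2 / lam.max())   # = 1.25 for this R

for mu in (0.1, 0.5, 1.0):
    factors = 1.0 - mu * lam              # per-mode factors (1 - mu*lambda_k)
    taus = -1.0 / np.log(np.abs(factors)) # iterations until |nu_k| has shrunk by e^-1
    print(mu, factors, taus)              # large mu*lambda_k gives a short time constant
```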