EE2-4: Communication Systems 4: Communication Systems

EE2--4: Communication Systems EE2 Dr. Cong Ling Department of Electrical and Electronic Engineering

1

Course Information •

Lecturer: Dr. Cong Ling (Senior Lecturer) – Office: Room 815, EE Building – Phone: 020-7594 020 7594 6214 – Email: [email protected]

•

Handouts – Slides: exam is based on slides – Notes: contain more details – Problem sheets

•

Course homepage – http://www.commsp.ee.ic.ac.uk/~cling – You Y can access llecture t slides/problem lid / bl sheets/past h t / t papers – Also available in Blackboard

•

Grading g – 1.5-hour exam, no-choice, closed-book

2

Lectures Introduction and background 1. Introduction 2. Probability and random processes 3 Noise 3. Effects of noise on analog communications 4. Noise performance of DSB 5. Noise performance of SSB and AM 6. Noise performance of FM 7. Pre/de-emphasis p for FM and comparison of analog systems

Digital communications 8. Digital representation of signals 9. Baseband digital transmission 10. Digital modulation 11 Noncoherent 11. N h td demodulation d l ti Information theory 12 Entropy and source coding 12. 13. Channel capacity 14. Block codes 15. Cyclic codes

3

EE2--4 vs. EE1EE2 EE1-6 • Introduction to Signals and Communications – How do communication systems work? – About modulation, demodulation, signal analysis... – The main mathematical tool is the Fourier transform for deterministic signal analysis. – More about analog communications (i.e., signals are continuous).

• Communications C – How do communication systems perform in the presence of noise? – About statistical aspects and noise. noise • This is essential for a meaningful comparison of various communications systems.

– The main mathematical tool is probability. probability – More about digital communications (i.e., signals are discrete).

4

Learning Outcomes • Describe a suitable model for noise in communications • Determine the signal signal-to-noise to noise ratio (SNR) performance of analog communications systems • Determine the probability of error for digital communications systems • Understand information theoryy and its significance g in determining system performance • Compare the performance of various communications systems

5

About the Classes •

You’re welcome to ask questions. – You can interrupt p me at any y time. – Please don’t disturb others in the class.

• • •

Our responsibility is to facilitate you to learn. You have to make the effort effort. Spend time reviewing lecture notes afterwards. If yyou have a question q on the lecture material after a class, then – Look up a book! Be resourceful. – Try to work it out yourself. yourself – Ask me during the problem class or one of scheduled times of availability.

6

References • • •

C. Ling, Notes of Communication Systems, Imperial Collage. S Haykin & M S. M. Moher Moher, Communication Systems, 5th ed., International Student Version, Wiley, 2009 (£43.99 from Wiley) y , Communication Systems, y , 4th ed.,, S. Haykin, Wiley, 2001 – Owns the copyright of many figures in these slides. – For F convenience, i th note the t “ 2000 Wil Wiley, th Haykin/Communication System, 4 ed.” is not shown for each figure.

• • •

B.P. B P Lathi, Lathi Modern Digital and Analog Communication Systems, 3rd ed., Oxford University Press, 1998 J.G. Proakis and M. Salehi,, Communication Systems Engineering, Prentice-Hall, 1994 L.W. Couch II, Digital and Analog Communication Systems, 6th ed., PrenticeH ll 2001 Hall, 7

Multitude of Communications • • • • • • •

Telephone network Internet Radio and TV broadcast Mobile communications Wi-Fi Satellite and space communications Smart power grid, healthcare…

• Analogue communications – AM, FM

• Digital communications – Transfer of information in digits – Dominant technology today – Broadband, 3G, DAB/DVB 8

What’s Communications? • Communication involves the transfer of information from one point to another. • Three basic elements – Transmitter: converts message into a form suitable for transmission – Channel: the physical medium, introduces distortion, noise, interference – Receiver: R i reconstruct t t a recognizable i bl fform off the th message

Speech Music Pictures Data … 9

Communication Channel • The channel is central to operation of a communication y system – Linear (e.g., mobile radio) or nonlinear (e.g., satellite) – Time invariant (e.g., fiber) or time varying (e.g., mobile radio)

• The information-carrying capacity of a communication system is proportional to the channel bandwidth • Pursuit for wider bandwidth – – – –

Copper wire: 1 MHz Coaxial cable: 100 MHz Microwave: GHz Optical fiber: THz • Uses light as the signal carrier • Highest capacity among all practical signals

10

Noise in Communications • Unavoidable presence of noise in the channel – Noise refers to unwanted waves that disturb communications – Signal is contaminated by noise along the path.

• External noise: interference from nearby channels, humanmade noise, natural noise... • Internal noise: thermal noise, random emission... in electronic devices • Noise is one of the basic factors that set limits on communications. i ti • A widely used metric is the signal-to-noise (power) ratio (SNR) signal power SNR= noise p power 11

Transmitter and Receiver • The transmitter modifies the message signal into a form suitable for transmission over the channel • This modification often involves modulation – Moving the signal to a high-frequency carrier (up-conversion) and varying some parameter of the carrier wave – Analog: AM, FM, PM – Digital: ASK ASK, FSK, FSK PSK (SK: shift keying)

• The receiver recreates the original message by demodulation – Recovery is not exact due to noise/distortion – The resulting degradation is influenced by the type of modulation

• Design of analog communication is conceptually simple g communication is more efficient and reliable; design g • Digital is more sophisticated 12

Objectives of System Design • Two primary resources in communications – Transmitted p power ((should be g green)) – Channel bandwidth (very expensive in the commercial market)

• In certain scenarios, one resource may be more important than the other – Power limited (e.g. deep-space communication) – Bandwidth limited (e.g. ( telephone circuit))

• Objectives of a communication system design – The message is delivered both efficiently and reliably reliably, subject to certain design constraints: power, bandwidth, and cost. – Efficiency is usually measured by the amount of messages sent in unit power, unit time and unit bandwidth. – Reliability is expressed in terms of SNR or probability of error.

13

Information Theory • •

In digital communications, is it possible to operate at zero error rate even though the channel is noisy? Poineers: Shannon, Kolmogorov… – The maximum rate of reliable transmission is calculated. – The famous Shannon capacity formula for a channel with bandwidth W (Hz) C = W log(1+SNR) bps (bits per second)

– Zero error rate is possible as long as actual signaling rate is less than C.

•

Shannon

Many concepts M t were fundamental f d t l and d paved d th the way for future developments in communication theory. – Provides a basis for tradeoff between SNR and bandwidth, and for comparing different communication schemes.

Kolmogorov 14

Milestones in Communications • • • •

1837, Morse code used in telegraph 1864 Maxwell formulated the eletromagnetic (EM) theory 1864, 1887, Hertz demonstrated physical evidence of EM waves 1890’s-1900’s 1890 s-1900 s, Marconi & Popov, Popov long-distance radio telegraph – Across Atlantic Ocean – From Cornwall to Canada

• 1875, Bell invented the telephone • 1906, radio broadcast • 1918, Armstrong invented superheterodyne radio receiver (and FM in 1933) • 1921, land-mobile communication 15

Milestones (2) • • • • • • • • • •

1928, Nyquist proposed the sampling theorem 1947,, microwave relayy system y 1948, information theory 1957, era of satellite communication began 1966, Kuen Kao pioneered fiber-optical communications (Nobel Prize Winner) 1970’ era off computer 1970’s, t networks t k b began 1981, analog cellular system 1988 digital cellular system debuted in Europe 1988, 2000, 3G network The big 3 telecom manufacturers in 2010

16

Cellular Mobile Phone Network • A large area is partitioned into cells • Frequency reuse to maximize capacity

17

Growth of Mobile Communications • 1G: analog communications – AMPS

• 2G: digital communications – GSM – IS-95

• 3G: CDMA networks – WCDMA – CDMA2000 – TD-SCDMA TD SCDMA

• 4G: data rate up to 1 Gbps (giga bits per second) – Pre-4G technologies: WiMax, 3G LTE 18

Wi--Fi Wi • Wi-Fi connects “local” computers (usually within 100m g ) range)

19

IEEE 802.11 WiWi-Fi Standard • 802.11b – Standard for 2.4GHz ((unlicensed)) ISM band – 1.6-10 Mbps, 500 ft range

• 802.11a – Standard for 5GHz band – 20-70 Mbps, variable range – Similar to HiperLAN in Europe

• 802.11g – Standard in 2.4 GHz and 5 GHz bands – Speeds up to 54 Mbps, based on orthogonal frequency division multiplexing (OFDM)

• 802.11n – Data rates up p to 600 Mbps p – Use multi-input multi-output (MIMO) 20

Satellite/Space Communication •

Satellite communication – Cover very large areas – Optimized for one-way transmission • Radio (DAB) and movie (SatTV) broadcasting

– Two-way systems • The only choice for remote-area and maritime communications • Propagation delay (0.25 s) is uncomfortable in voice communications

•

Space communication – – – –

Missions to Moon, Mars, … Long distance, weak signals High-gain antennas Powerful error-control coding

21

Future Wireless Networks Ubiquitous Communication Among People and Devices Wireless Wi l IInternet t t access Nth generation Cellular Ad Hoc Networks Sensor Networks Wireless Entertainment Smart Homes/Grids Automated Highways All this and more… •Hard Delay Constraints •Hard Energy Constraints

22

Communication Networks • Today’s communications networks are complicated systems – A large n number mber of users sers sharing the medi medium m – Hosts: devices that communicate with each other – Routers: route date through the network

23

Concept of Layering • Partitioned into layers, each doing a relatively simple task • Protocol stack

Application

Network

Transport Network Link Ph i l Physical

OSI Model

TCP/IP protocol stack (Internet)

Physical

2-layer model

Communication Systems mostly deals with the physical layer layer, but some techniques (e.g., coding) can also be applied to the network layer.

24

EE2--4: Communication Systems EE2

Lecture 2: Probability and Random Processes Dr. Cong Ling Department of Electrical and Electronic Engineering

25

Outline • Probability – – – – –

How probability is defined cdf and pdf Mean and variance Joint distribution Central limit theorem

• Random processes p – Definition – Stationary random processes – Power spectral density

• References – Notes of Communication Systems, y , Chap. p 2.3. – Haykin & Moher, Communication Systems, 5th ed., Chap. 5 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap 11 Chap. 26

Why Probability/Random Process? • Probability is the core mathematical tool for communication y theory. • The stochastic model is widely used in the study of communication systems. • Consider a radio communication system where the received signal is a random process in nature: – – – –

Message is random. No randomness, no information. Interference is random. Noise is a random process. process And many more (delay, phase, fading, ...)

• Other real-world applications of probability and random processes include – Stock market modelling, g g gambling g ((Brown motion as shown in the previous slide, random walk)… 27

Probabilistic Concepts • What is a random variable (RV)? – It is a variable that takes its values from the outputs p of a random experiment.

• What is a random experiment? – It is an experiment the outcome of which cannot be predicted precisely. – All possible identifiable outcomes of a random experiment constitute its sample space S. – An event is a collection of possible outcomes of the random experiment. i

• Example – For tossing a coin coin, S = { H H, T } – For rolling a die, S = { 1, 2, …, 6 }

28

Probability Properties • PX(xi): the probability of the random variable X taking on the value xi • The probability of an event to happen is a non-negative number, with the following properties: – The probability of the event that includes all possible outcomes of the experiment is 1. – The probability of two events that do not have any common outcome is the sum of the probabilities of the two events separately.

• Example – Roll a die:

PX(x = k) = 1/6

for k = 1, 2, …, 6

29

CDF and PDF •

The (cumulative) distribution function (cdf) of a random variable X is defined as the probability of X taking a value less than the argumentt x:

FX ( x)  P( X  x)

•

Properties

FX ()  0, FX ()  1 FX ( x1 )  FX ( x2 ) if x1  x2

•

The probability density function (pdf) is defined as the derivative of the distribution function: f X ( x) 

dF X ( x ) dx x

FX ( x ) 



f X ( y ) dy

 b

P ( a  X  b )  FX (b )  FX ( a ) 



f X ( y ) dy

a

f X ( x) 

dF X ( x ) dx

 0 since F X ( x ) is non - decreasing 30

Mean and Variance •

If  x is sufficiently small,

P ( x  X  x  x ) 

x  x

f X ( y)

 f X ( y ) dy  f X ( x )  x

x

Area

f X ( x)

x y

x •

Mean (or expected value  DC level):

E[ X ]   X  •







x f X ( x ) dx

E[ ]: expectation operator

Variance ( power for zero-mean signals):

 X2  E[( X   X ) 2 ] 





( x   X ) 2 f X ( x)dx  E[ X 2 ]   X2



31

Normal (Gaussian) Distribution f X ( x)

 m

0

f X ( x)  FX ( x ) 

1 2  1 2 

e



( x m )2 2 2

x 

e



( y m )2 2 2

f  x for dy

x

E[ X ]  m

X2  2  : rm s v a lu e

32

Uniform Distribution fX(x)

 1  f X ( x)   b  a  0 0 xa  FX ( x )   b  a 1

a xb elsewhere xa a xb

ab 2 (b  a ) 2  12

E[ X ] 

X2

xb 33

Joint Distribution • Joint distribution function for two random variables X and Y FXY ( x , y )  P ( X  x , Y  y ) • Joint probability density function

f XY ( x, y ) 

• Properties

 2 FXY ( x , y ) xy

 

1) FXY (, ) 



f XY (u , v)dudv d d 1

  

2)

f X ( x) 



f XY ( x, y ) dy

y  

3)

fY ( x ) 



f XY ( x, y ) dx

x 

4)

X , Y are independent  f XY ( x, y )  f X ( x) fY ( y )

5)

X , Y are uncorrelated  E[ XY ]  E[ X ]E[Y ] 34

Independent vs. Uncorrelated • Independent implies Uncorrelated (see problem sheet) • Uncorrelated does not imply Independence • For normal RVs (jointly Gaussian), Uncorrelated implies Independent (this is the only exceptional case!) • An example of uncorrelated but dependent RV’s Let  be uniformly y distributed in [0,2 ]

f ( x)  21

for 0  x  2

Y

Define RV’s X and Y as X  cos  Y  sin  Clearly, y, X and Y are not independent. p But X and Y are uncorrelated:

E[ XY ] 

1 2



2

0

Locus of X and Y X

cos  sin  d  0! 35

Joint Distribution of n RVs • Joint cdf FX1 X 2 ... X n ( x1 , x2 ,...xn )  P( X 1  x1 , X 2  x2 ,... X n  xn )

• Joint pdf

f X 1 X 2 ... X n ( x1 , x2 ,...xn ) 

 n FX1X 2 ... X n ( x1 , x2 ,... xn ) x1x2 ...xn

• Independent FX 1 X 2 ... X n ( x1 , x2 ,...xn )  FX 1 ( x1 ) FX 2 ( x2 )...FX n ( xn ) f X 1 X 2 ... X n ( x1 , x2 ,...xn )  f X 1 ( x1 ) f X 2 ( x2 )... ) f X n ( xn )

• i.i.d. (independent, identically distributed) – The random variables are independent and have the same distribution. – Example: outcomes from repeatedly flipping a coin. 36

Central Limit Theorem • For i.i.d. random variables, z = x1 + x2 +· · ·+ xn tends to Gaussian as n goes to infinity. • Extremely useful in communications. • That’s why noise is usually Gaussian. We often say “Gaussian Gaussian noise” noise or “Gaussian channel” in communications.

x1

x1 + x2 + x3

x1 + x2

x1 + x2+ x3 + x4

Illustration of convergence to Gaussian distribution 37

What is a Random Process? • A random process is a time-varying function that assigns the outcome of a random experiment to each time instant: X(t). • For a fixed (sample path): a random process is a time varying function, e.g., a signal. • For fixed t: a random p process is a random variable. • If one scans all possible outcomes of the underlying random experiment, we shall get an ensemble of signals. • Noise can often be modelled as a Gaussian random process.

38

An Ensemble of Signals

39

Statistics of a Random Process • For fixed t: the random process becomes a random variable, with mean 

 X (t )  E[ X (t )]   x f X ( x; t )dx 

– In general general, the mean is a function of t. t

• Autocorrelation function

RX (t1 , t2 )  E[ X (t1 ) X (t2 )]  







 

xy f X ( x, y; t1 , t2 )dxdy

– In general, the autocorrelation function is a two-variable function. – It measures the correlation between two samples.

40

Stationary Random Processes • A random process is (wide-sense) stationary if – Its mean does not depend p on t

 X (t )   X – Its autocorrelation function only depends on time difference

RX (t, t  )  RX ( )

• In communications,, noise and message g signals g can often be modelled as stationary random processes. 41

Example • Show that sinusoidal wave with random phase X (t )  A cos((  c t   ) with phase Θ uniformly distributed on [0,2π] is stationary. – Mean is a constant: 1 2 f ( )  ,  [0 [0, 2 ] 1  X (t )  E[ X (t )]   A cos(ct   ) d  0 2 0 2 – Autocorrelation function only depends on the time difference: RX (t , t   )  E[ X (t ) X (t   )]  E[ A2 cos(c t  ) cos(c t  c  )] A2 A2  E[cos(2c t  c  2)]  E[cos(c )] 2 2 A2 2  1 A2  cos(2c t  c  2 ) d  cos(c )  0 2 2 2 A2 RX ( )  cos(c ) 2 42

Power Spectral Density • Power spectral density (PSD) is a function that measures the distribution of power of a random process with frequency. • PSD is only defined for stationary processes. • Wiener-Khinchine Wiener Khinchine relation: The PSD is equal to the Fourier transform of its autocorrelation function: 

S X ( f )   R X ( )e  j 2f d 

– A similar relation exists for deterministic signals

• Then the average power can be found as  2 P  E[ X (t )]  R X (0)   S X ( f )df  • The Th frequency f content off a process depends d d on how h rapidly the amplitude changes as a function of time. – This can be measured by the autocorrelation function function. 43

Passing Through a Linear System

• L Lett Y(t) obtained bt i d b by passing i random d process X(t) th through h a linear system of transfer function H(f). Then the PSD of Y(t) 2 (2.1) SY ( f )  H ( f ) S X ( f ) – Proof: see Notes 3 3.4.2. 42 – Cf. the similar relation for deterministic signals

• If X(t) ( ) is a Gaussian p process,, then Y(t) ( ) is also a Gaussian process. – Gaussian processes are very important in communications. 44


Lecture 3: Noise Dr. Cong Ling Department of Electrical and Electronic Engineering

45

Outline • • • •

What is noise? White noise and Gaussian noise Lowpass noise Bandpass noise – In-phase/quadrature representation – Phasor representation p

• References – Notes of Communication Systems, Chap. 2. – Haykin & Moher, Communication Systems, 5th ed., Chap. 5 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 11

46

Noise • Noise is the unwanted and beyond our control waves that g disturb the transmission of signals. • Where does noise come from? – External sources: e.g., atmospheric, galactic noise, interference; – Internal sources: generated by communication devices themselves. • This type of noise represents a basic limitation on the performance of electronic communication systems systems. • Shot noise: the electrons are discrete and are not moving in a continuous steady flow, so the current is randomly fluctuating. • Thermal Th l noise: i caused db by th the rapid id and d random d motion ti off electrons l t within a conductor due to thermal agitation.

• Both are often stationaryy and have a zero-mean Gaussian distribution (following from the central limit theorem).

47

White Noise • The additive noise channel – n(t) ( ) models all types yp of noise – zero mean

• White noise – Its power spectrum density (PSD) is constant over all frequencies, i.e., N   f   SN ( f )  0 , 2 – Factor 1/2 is included to indicate that half the power is associated with positive frequencies and half with negative. – The term white is analogous to white light which contains equal amounts of all frequencies (within the visible band of EM wave). – It It’ss only defined for stationary noise noise.

• An infinite bandwidth is a purely theoretic assumption.

48

White vs. Gaussian Noise • White noise

SN(f)

PSD

Rn()

N

– Autocorrelation function of n(t ) : Rn ( )  0  ( ) 2 – Samples S l att diff differentt time ti iinstants t t are uncorrelated. l t d

• Gaussian noise: the distribution at any time instant is Gaussian Gaussian – Gaussian noise can be colored

• White noise  Gaussian noise

PDF

– White noise can be non-Gaussian

• Nonetheless, in communications, it is typically additive white Gaussian noise (AWGN) (AWGN). 49

Ideal Low Low--Pass White Noise • Suppose white noise is applied to an ideal low-pass filter of bandwidth B such that  N0  , | f | B SN ( f )   2 0, otherwise

Power PN = N0B

• By B Wiener-Khinchine Wi Khi hi relation, l i autocorrelation l i ffunction i Rn() = E[n(t)n(t+)] = N0B sinc(2B) (3.1) where sinc(x) = sin(x)/x. x • Samples at Nyquist frequency 2B are uncorrelated Rn() = 0, 0  = k/(2B), k/(2B) k = 1, 1 2, 2 …

50

Bandpass Noise •

Any communication system that uses carrier modulation will typically have a bandpass filter of bandwidth B at the front-end of the receiver.

n(t)

•

Any y noise that enters the receiver will therefore be bandpass p in nature: its spectral magnitude is non-zero only for some band concentrated around the carrier frequency fc (sometimes called narrowband noise).

51

Example • If white noise with PSD of N0/2 is passed through an ideal bandpass p filter,, then the PSD of the noise that enters the receiver is given by  N0  , f  fC  B SN ( f )   2 0, otherwise Power PN = 2N0B

• Autocorrelation function Rn() = 2N0Bsinc(2B)cos(2fc) – which follows from (3.1) by applying l i th the ffrequency-shift hift property of the Fourier transform

g ( t )  G ( ) g ( t )  2 cos  0 t  [ G (   0 )  G (   0 )]

• Samples taken at frequency 2B are still uncorrelated. uncorrelated Rn() = 0,  = k/(2B), k = 1, 2, … 52

Decomposition of Bandpass Noise • Consider bandpass noise within f  fC  B with any PSD ((i.e.,, not necessarilyy white as in the previous p example) p ) • Consider a frequency slice ∆f at frequencies fk and −fk. • For ∆f small: n k ( t )  a k cos( 2 f k t   k ) – θk: a random p phase assumed independent p and uniformly y distributed in the range [0, 2) – ak: a random amplitude. ∆f

-ffk

fk 53

Representation of Bandpass Noise • The complete bandpass noise waveform n(t) can be constructed by summing up such sinusoids over the entire band ii.e., band, e n(t )   nk (t )   ak cos(2 f k t   k ) k

f k  f c  k f

(3.2)

k

• Now, let fk = (fk − fc)+ fc, and using cos(A + B) = cosAcosB − sinAsinB we obtain the canonical form of bandpass noise i

n(t )  nc (t ) cos(2f ct )  ns (t ) sin( 2f ct ) where

nc (t )   ak cos(2 ( f k  f c )t   k ) k

(3.3)

ns (t )   ak sin( 2 ( f k  f c )t   k ) k

– nc(t) and ns(t) are baseband signals, signals termed the in-phase in phase and quadrature component, respectively. 54

Extraction and Generation • nc(t) and ns(t) are fully representative of bandpass noise. – (a) ( ) Given bandpass p noise,, one mayy extract its in-phase p and quadrature components (using LPF of bandwith B). This is extremely useful in analysis of noise in communication receivers. – (b) Given the two components, components one may generate bandpass noise noise. This is useful in computer simulation.

nc(t)

nc(t)

ns(t)

ns(t)

55

Properties of Baseband Noise • If the noise n(t) has zero mean, then nc(t) and ns(t) have zero mean. • If the noise n(t) is Gaussian, then nc(t) and ns(t) are Gaussian. • If the noise n(t) is stationary, then nc(t) and ns(t) are stationary. • If the noise n(t) is Gaussian and its power spectral density S( f ) is symmetric with respect to the central frequency fc, th nc(t) then ( ) and d ns(t) ( ) are statistical t ti ti l independent. i d d t • The components nc(t) and ns(t) have the same variance (= power) as n(t). )

56

Power Spectral Density • Further, each baseband noise waveform will have the same PSD:  S N ( f  f c )  S N ( f  f c ), | f | B Sc ( f )  S s ( f )   otherwise 0,

(3.4)

• This is analogous to g (t )  G ( ) g (t )2 cos 0t  [G (  0 )  G (  0 )] – A rigorous g p proof can be found in A. Papoulis, p , Probability, y, Random Variables, and Stochastic Processes, McGraw-Hill. – The PSD can also be seen from the expressions (3.2) and (3.3) where each of nc(t) and ns(t) consists of a sum of closely spaced base-band sinusoids.

57

Noise Power • For ideally filtered narrowband noise, the PSD of nc(t) Sc(f)=Ss(f) and ns((t)) is therefore g given by y  N 0 , | f | B Sc ( f )  S s ( f )   0, otherwise

(3.5)

• Corollary: The average power in each of the baseband waveforms f nc(t) ( ) and d ns(t) ( ) is i identical id ti l to t the th average power in the bandpass noise waveform n(t). • For ideally filtered narrowband noise noise, the variance of nc(t) and ns(t) is 2N0B each.

PNc = PNs = 2N0B 58

Phasor Representation • We may write bandpass noise in the alternative form: n(t )  nc (t ) cos(2 f c t )  ns (t ) sin(2 f c t )  r (t ) cos[2 f c t   (t )] – –

r (t )  nc (t ) 2  ns (t ) 2  ns ( t )    nc (t ) 

 (t )  tan 1 

: the envelop of the noise : the p phase of the noise

θ( ) ≡ 2 fct + (t) θ(t) ()

59

Distribution of Envelop and Phase • It can be shown that if nc(t) and ns(t) are Gaussiang r(t) ( ) has a Rayleigh y g distributed,, then the magnitude distribution, and the phase (t) is uniformly distributed. • What if a sinusoid Acos(2 fct) is mixed with noise? • Then the magnitude g will have a Rice distribution. • The p proof is deferred to Lecture 11,, where such distributions arise in demodulation of digital signals.

60

Summary • White noise: PSD is constant over an infinite bandwidth. • Gaussian noise: PDF is Gaussian Gaussian. • Bandpass noise – In-phase In phase and quadrature compoments nc(t) and ns(t) are low-pass low pass random processes. – nc(t) and ns(t) have the same PSD. – nc(t) and ns(t) have the same variance as the band-pass noise n(t). – Such properties will be pivotal to the performance analysis of bandpass communication systems.

• The in-phase/quadrature representation and phasor representation p are not onlyy basic to the characterization of bandpass noise itself, but also to the analysis of bandpass communication systems. 61


Lecture 4: Noise Performance of DSB Dr. Cong Ling Department of Electrical and Electronic Engineering

62

Outline • SNR of baseband analog transmission • Revision of AM • SNR of DSB-SC • References – Notes of Communication Systems Systems, Chap Chap. 3 3.1-3.3.2. 1-3 3 2 – Haykin & Moher, Communication Systems, 5th ed., Chap. 6 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 12

63

Noise in Analog Communication Systems • How do various analog modulation schemes perform in the presence of noise? • Which scheme performs best? • How can we measure its performance?

Model of an analog communication system

Noise PSD: BT is the bandwidth, N0/2 is the double-sided noise PSD 64

SNR • We must find a way to quantify (= to measure) the performance of a modulation scheme. p • We use the signal-to-noise ratio (SNR) at the output of the receiver: average power of message signal at the receiver output PS SNRo   average power of noise at the receiver output PN – Normally expressed in decibels (dB) – SNR (dB) = 10 log10(SNR) – This is to manage the wide range of power levels in communication systems – In honour of Alexander Bell – Example: p

dB If x is power, X (dB) = 10 log10(x) If x is amplitude, X ((dB)) = 20 log g10((x))

• ratio of 2  3 dB; 4  6 dB; 10  10dB 65

Transmitted Power • PT: The transmitted power • Limited by: equipment capability capability, battery life life, cost cost, government restrictions, interference with other channels, green communications etc • The higher it is, the more the received power (PS), the higher the SNR • For a fair comparison between different modulation schemes: – PT should be the same for all

• We use the baseband signal to noise ratio SNRbaseband to calibrate the SNR values we obtain

66

A Baseband Communication System • It does not use modulation • It is suitable for transmission over wires • The power it transmits is identical to the message power: PT = P • No attenuation: PS = PT = P • The results can be extended to band band-pass pass systems

67

Output SNR • • • •

Average signal (= message) power P = the area under the triangular curve Assume: Additive, white noise with power spectral density PSD = N0/2 A Average noise i power att th the receiver i PN = area under the straight line = 2W  N0/2 = WN0 SNR at the receiver output: SNRbaseband

PT  N 0W

– Note: Assume no propagation loss

•

Improve the SNR by: – increasing the transmitted power (PT ↑), – restricting the message bandwidth (W ↓), – making the channel/receiver less noisy (N0 ↓). ↓) 68

Revision: AM • General form of an AM signal:

s(t ) AM  [ A  m(t )] cos(2f ct ) – A A: the th amplitude lit d off the th carrier i – fc: the carrier frequency – m(t): the message signal

• Modulation index:



mp A

– mp: the peak amplitude of m(t), i.e., mp = max |m(t)|

69

Signal Recovery

n(t)

Receiver model

1)   1  A  m p : use an envelope l d detector. t t This is the case in almost all commercial AM radio receivers receivers. Simple circuit to make radio receivers cheap. 2) Otherwise: use synchronous detection = product detection = coherent detection Th terms The t detection d t ti and d demodulation d d l ti are used d iinterchangeably. t h bl 70

Synchronous Detection for AM • Multiply the waveform at the receiver with a local carrier of the same frequency (and phase) as the carrier used at the t transmitter: itt 2 cos(2 f c t ) s (t ) AM  [ A  m(t )]2 cos 2 (2 f c t )  [ A  m(t )][1  cos(4 f c t )]  A  m(t ) 

• Use a LPF to recover A + m(t) and finally m(t) • Remark: At the receiver you need a signal perfectly synchronized with the transmitted carrier

71

DSB--SC DSB • Double-sideband suppressed carrier (DSB-SC)

s (t ) DSB  SC  Am(t ) cos(2 f c t ) • Signal recovery: with synchronous detection only • The received noisy signal is x (t )  s (t )  n (t )  s(t )  nc (t ) cos(2f c t )  ns (t ) sin(2f c t )  Am(t ) cos(2f c t )  nc (t ) cos(2f c t )  ns (t ) sin(2f ct )  [ Am A (t )  nc (t )] cos((2f c t )  ns (t ) sin( i (2f c t ) y(t)

2 72

Synchronous Detection for DSBDSB-SC • Multiply with 2cos(2fct): y (t )  2 cos(2 f c t ) x(t )  Am(t )2 cos 2 (2 f c t )  nc (t )2 cos 2 (2 f c t )  ns (t ) sin(4 f c t )  Am(t )[1  cos(4 f c t )]  nc (t )[1  cos(4 f c t )]  ns (t ) sin(4 f c t ) • Use a LPF to keep

~ y  Am(t )  nc (t ) • Signal power at the receiver output:

PS  E{A2m2 (t)}  A2E{m2 (t)}  A2P • Power of the noise nc(t) (recall (3 (3.5), 5) and message bandwidth W): W

PN   N 0df  2 N 0W W

73

Comparison •

SNR at the receiver output:

A2 P SNRo  2 N 0W • To which transmitted power does this correspond? A2 P 2 2 2 PT  E{ A m(t ) cos (2 f c t )}  2 • So

SNRo  •

Comparison with

SNRbaseband  •

PT  SNRDSB  SC N0W

PT  SNRDSB  SC  SNRbaseband N 0W

Conclusion: DSB DSB-SC SC system has the same SNR performance as a baseband system. 74


Lecture 5: Noise performance of SSB and AM Dr. Cong Ling Department of Electrical and Electronic Engineering

75

Outline • Noise in SSB • Noise in standard AM – Coherent detection (of theoretic interest only) – Envelope detection

• References – Notes of Communication Systems, Chap. 3.3.3 3.3.3-3.3.4. 3.3.4. – Haykin & Moher, Communication Systems, 5th ed., Chap. 6 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 12 76

SSB Modulation • Consider single (lower) sideband AM: A A s(t )SSB  m(t ) cos 2f ct  mˆ (t ) sin i 2f ct 2 2 where mˆ (t ) is the Hilbert transform of m(t). • mˆ (t ) is obtained by passing m(t) through a linear filter with transfer function −jsgn(f ). • mˆ (t ) and m(t) have the same power P . • The average power is A2P/4.

77

Noise in SSB • Receiver signal x(t) = s(t) + n(t). • Apply a band-pass band pass filter on the lower sideband sideband. • Still denote by nc(t) the lower-sideband noise (different from the double double-sideband sideband noise in DSB). • Using coherent detection: y (t )  x(t )  2 cos(2 f c t ) A  A    m(t )  nc (t )    m(t )  nc (t )  cos(4 f c t ) 2  2  A    mˆ (t )  ns (t )  sin(4 f c t ) 2 

• After low-pass filtering,

A  y (t )   m(t )  nc (t )  2  78

Noise Power • Noise power for nc(t) = that for band-pass noise = N0W ((halved compared p to DSB)) ((recall ((3.4)) )) SN(f)

N0/2 f -ffc

0

-ffc+W

fc-W W

fc

Lower-sideband noise

SNc(f) N0/2

f -W W

0

W

Baseband noise

79

Output SNR • Signal power A2P/4 • SNR at output

A2 P SNRSSB  4 N 0W • For a baseband system with the same transmitted power A2P/4 A2 P SNRbaseband  4 N 0W • Conclusion: SSB achieves the same SNR performance as DSB-SC (and the baseband model) but only requires half the band-width. 80

Standard AM: Synchronous Detection • Pre-detection signal:

x(t )  [ A  m(t )]cos(2f ct )  n(t )  [ A  m(t )]cos(2f ct )  nc (t ) cos(2f ct )  ns (t ) sin(2f ct )  [ A  m(t )  nc (t )]cos(2f ct )  ns (t ) sin(2f ct ) • Multiply with 2 cos(2fct):

y (t )  [ A  m(t )  nc (t )][1  cos(4 f c t )]  ns (t ) sin(4 f c t ) • LPF

~ y  A  m(t )  nc (t ) 81

Output SNR • Signal power at the receiver output: PS  E{m2(t)} P • Noise power: PN  2 N 0W • SNR at the receiver output: P SNRo   SNRAM 2 N0W

• Transmitted power A2 P A2  P PT    2 2 2

82

Comparison • SNR of a baseband signal with the same transmitted power: p A2  P SNRbaseband 

2 N 0W

• Thus: SNRAM

• Note:

P SNRbaseband  2 A P

P 1 2 A P

• Conclusion: the performance of standard AM with synchronous recovery is worse than that of a baseband system. y 83

Model of AM Radio Receiver

AM radio receiver of the superheterodyne type

Model of AM envelope detector 84

Envelope Detection for Standard AM • Phasor diagram of the signals present at an AM receiver

x(t)

• Envelope

y (t )  envelope of x (t )  [ A  m(t )  nc (t )]2  ns (t ) 2 • Equation is too complicated • Must use limiting cases to put it in a form where noise and message are added 85

Small Noise Case • 1st Approximation: (a) Small Noise Case

n ( t )  [ A  m ( t )] • Then • Then • Thus

ns (t )  [ A  m(t )  nc (t )] y (t )  [ A  m(t )  nc (t )] SNRo 

P  SNRenv 2 N 0W

Identical to the postdetection signal in the case off synchronous h detection!

• And in terms of baseband SNR: SNRenv

P  2 SNRbaseband A P

• Valid for small noise onl only!! 86

Large Noise Case • 2nd Approximation: (b) Large Noise Case

n(t )  [ A  m(t )] • Isolate the small quantity: y2 (t )  [ A  m(t)  nc (t )]2  ns2 (t)

 ( A  m(t))2  nc2 (t )  2( A  m(t ))nc (t )  ns2 (t ) 2  A  m t ( ( )) 2( A  m(t))nc (t )  2 2  [nc (t )  ns (t )]1  2  2  2 2 2 nc (t)  ns (t )   nc (t)  ns (t )

 2[ A  m(t )]nc (t )   y (t )  [n (t )  n (t )]1  2 2 nc (t )  ns (t )   2

2 c

2 s

 2[ A  m(t )]nc (t )    E (t )1  2 En (t (t )   2 n

En (t )  nc2 (t )  ns2 (t ) 87

Large Noise Case: Threshold Effect • •

From the phasor diagram: nc(t) = En(t) cosθn(t) Then:

y(t)  En (t) 1  •

U Use

1 x 1

x for x  1 2

2[ A  m(t)]cosn (t) En (t)

 [ A  m(t )] cos  n (t )   y (t )  En (t )1  En ( t )    En (t )  [ A  m(t )] cos  n (t )

• • •

Noise is multiplicative here! No term proportional to the message! R Result: lt a threshold th h ld effect, ff t as below b l some carrier i power llevell ((very low A), the performance of the detector deteriorates very rapidly.

88

Summary

A: carrier amplitude amplitude, P: power of message signal signal, N0: single-sided PSD of noise noise, W: message bandwidth. 89


Lecture 6: Noise Performance of FM Dr. Cong Ling Department of Electrical and Electronic Engineering

90

Outline • Recap of FM • FM system model in noise • PSD of noise • References – Notes of Communication Systems, Chap. 3.4.1-3.4.2. – Haykin & Moher, Communication Systems, 5th ed., Chap. 6 – Lathi, Lathi Modern Digital and Analog Communication Systems Systems, 3rd ed., Chap. 12

91

Frequency Modulation • Fundamental difference between AM and FM: • AM: message information contained in the signal amplitude ֜ Additive noise: corrupts directly the modulated signal. • FM: message information contained in the signal frequency ֜ the effect of noise on an FM signal is determined by the extent to which it changes the frequency of the modulated signal. • Consequently, FM signals is less affected by noise than AM signals 92

Revision: FM • A carrier waveform s(t) ( ) = A cos[[i((t)] )]

– where i(t): the instantaneous phase angle.

• When

s(t) = A cos(2f t) ֜ i(t) = 2f t we may say that

d 1 d  2f  f  dt 2 dt • Generalisation: instantaneous frequency: 1 d i ( t ) f i (t )  2 dt

93

FM • In FM: the instantaneous frequency of the carrier varies g linearlyy with the message: fi(t) = fc + kf m(t) – where kf is the frequency sensitivity of the modulator.

• Hence (assuming i(0)=0): t

t

i (t )  2  fi ( )d  2 f c t  2 k f  m( )d 0

0

• Modulated signal: • Note:

t  s(t )  A cos 2f c t  2k f  m( )d    0

– (a) The envelope is constant – (b) Signal s(t) is a non-linear function of the message signal m(t).

94

Bandwidth of FM • mp = max|m(t)|: peak message amplitude. • fc − kf mp < instantaneous frequency q y < fc + kf mp • Define: frequency deviation = the deviation of the instantaneous frequency from the carrier frequency: ∆f = kf mp • Define: deviation ratio:   f / W – W: the message bandwidth. – Small β β: FM bandwidth  2x message g bandwidth ((narrow-band FM)) – Large β: FM bandwidth >> 2x message bandwidth (wide-band FM)

• Carson’s rule of thumb: BT = 2W(β+1) = 2(∆f + W) – β 1 ֜ BT ≈ 2∆ff

95

FM Receiver

n(t)

•

• • •

Bandpass filter: removes any signals outside the bandwidth of fc ± BT/2 ֜ the predetection noise at the receiver is bandpass with a b d idth off BT. bandwidth FM signal has a constant envelope ֜ use a limiter to remove any y amplitude p variations Discriminator: a device with output proportional to the deviation in the instantaneous frequency ֜ it recovers the message signal Final baseband low-pass lo pass filter: filter has a bandwidth band idth of W ֜ it passes the message signal and removes out-of-band noise.

96

Linear Argument at High SNR •

FM is nonlinear (modulation & demodulation), meaning superposition doesn’t hold.

•

Nonetheless, it can be shown that for high SNR, noise output and message signal are approximately independent of each other: Output ≈ Message + Noise (i.e., no other nonlinear terms).

• • •

Any (smooth) nonlinear systems are locally linear! This can be justified rigorously by applying Taylor series expansion. Noise does not affect power of the message signal at the output output, and vice versa. ֜ We can compute the signal power for the case without noise, and accept that the result holds for the case with noise too. ֜ We can compute the noise power for the case without message, and accept that the result holds for the case with message too. too

• •

97

Output Signal Power Without Noise • Instantaneous frequency of the input signal: fi  fc  k f m(t)

• Output of discriminator: k f m((t )

• So, output signal power: PS  k 2f P – P : the average power of the message signal

98

Output Signal with Noise • In the presence of additive noise, the real predetection signal p g is t  x(t )  A cos 2 f c t  2 k f  m( )d    0  nc (t ) cos(2 f c t )  ns (t ) sin(2 f c t )

• It can be shown (by linear argument again): For high SNR, noise output is approximately independent of the message signal ֜ In order to calculate the power of output noise, we may assume there is no message ֜ i.e.,, we onlyy have the carrier plus p noise p present:

~ x (t )  A cos(2f c t )  nc (t ) cos(2f c t )  ns (t ) sin(2f c t ) 99

Phase Noise Phasor diagram of the FM carrier and noise signals

ns(t) A

• Instantaneous I t t phase h noise: i i (t)  tan1

nc(t)

ns (t ) A  nc (t )

• For large carrier power (large A): i (t )  tan 1

ns (t ) ns (t )  A A

• Discriminator output = instantaneous frequency: f i (t ) 

1 di (t ) 1 dns (t )  2 dt 2 A dt 100

Discriminator Output • The discriminator output in the presence of both g and noise: signal 1 dn d s (t ) k f m( t )  2A dt • What is the PSD of • Fourier theory:

dns (t ) nd (t )  d dt

x(t )  X ( f ) dx(t ) then  j 2 fX ( f ) dt

if

• Differentiation with respect to time = passing the signal through a system with transfer function of H(f ) = j2 f 101

Noise PSD • It follows from (2.1) that So ( f ) | H ( f ) |2 Si ( f )

•

– Si(f ): PSD of input signal – So(f ): PSD of output signal – H(f )): transfer t f function f ti off the th system t PSD of nd (t )  | j 2 f |2 PSD of ns (t ) Then:

PSD of ns (t )   N 0 within band   PSD of nd (t )  | 2 f |2  N 0

BT   2  f  BT / 2

2

1 dns (t )   1  f2  2 PSD of fi (t )    | 2 f |  N 0  2 N 0 2 A dt   2 A  A 

• Aft After the th LPF, LPF the th PSD off noise i output t t no(t) ( ) is i restricted t i t d iin the band ±W f2 S No ( f )  2 N 0 A

f W

(6 1) (6.1) 102

Power Spectral Densities SNs(f)

(a) Power spectral density of quadrature component ns(t) of narrowband noise n(t). (b) Power spectral density of noise nd(t) at the discriminator output. (c) Power spectral density of noise no(t) at the receiver output.

103


Lecture 7: Pre/de Pre/de--emphasis for FM and Comparison of Analog Systems Dr. Cong Ling Department of Electrical and Electronic Engineering

104

Outline • Derivation of FM output SNR • Pre/de-emphasis to improve SNR • Comparison with AM • References – Notes of Communication Systems, Chap. 3.4.2-3.5. – Haykin & Moher, Communication Systems, 5th ed., Chap. 6 – Lathi, Lathi Modern Digital and Analog Communication Systems Systems, 3rd ed., Chap. 12

105

Noise Power • Average noise power at the receiver output: W

PN   S No ( f )df W

• Thus, Thus from (6 (6.1) 1) PN  

W

W

2 N 0W 3 f2 N 0 df  2 A 3 A2

(7.1)

• Average noise power at the output of a FM receiver 1  carrier power A2 • A ↑ ֜ Noise↓, Noise↓ called the quieting effect

106

Output SNR • Since PS  k 2f P , the output SNR 2 2 PS 3 A k f P SNRO    SNRFM 3 PN 2 N 0W

• Transmitted power of an FM waveform:

• From

A2 PT  2 k f mp PT and   : SNRbaseband  N 0W W 3k 2f P 2 P SNR FM  SNR 3 SNRbaseband   baseband 2 2 W mp

  2 SNRbaseband (could be much higher than AM)

• Valid when the carrier power is large compared with the noise power 107

Threshold effect • The FM detector exhibits a more pronounced threshold p detector. effect than the AM envelope • The threshold point occurs around when signal power is 10 times noise power: A2  10, 2 N 0 BT

BT  2W (   1)

• Below the threshold the FM receiver breaks (i.e., significantly deteriorated). • Can be analyzed by examining the phasor diagram

ns(t) A

nc(t) 108

Qualitative Discussion • As the noise changes randomly, the point P1 wanders around P2 – High SNR: change of angle is small – Low SNR: P1 occasionally sweeps around origin, resulting in changes h off 2 2 in i a short h t ti time

Illustrating impulse like components in   (t)  d(t)/dt produced by changes of 2 in  (t); (a) and (b) are graphs of

 (t) and  (t), (t) respectively. respectively

109

Improve Output SNR

• PSD of the noise at the detector output ‫ ן‬square of frequency. • PSD of a typical message typically rolls off at around 6 dB per decade p • To increase SNRFM: – Use a LPF to cut-off high frequencies at the output • Message is attenuated too, not very satisfactory

– Use pre-emphasis and de-emphasis • Message is unchanged • High frequency components of noise are suppressed 110

Pre--emphasis and DePre De-emphasis

• • • • •

Hpe(f ): used to artificially emphasize the high frequency components of the message prior to modulation, and hence, before noise is introduced. Hde(f ): used to de-emphasize the high frequency components at the receiver, and restore the original PSD of the message signal. In theory, Hpe(f ) ‫ ן‬f , Hde(f ) ‫ ן‬1/f . Thi can improve This i the th output t t SNR by b around d 13 dB. dB Dolby noise reduction uses an analogous pre-emphasis technique to reduce the effects of noise (hissing noise in audiotape recording is also l concentrated t t d on hi high h ffrequency). ) 111

Improvement Factor • Assume an ideal pair of pre/de-emphasis filters Hde ( f ) 1/ Hpe ( f ), )

f W

• PSF of noise at the output of de-emphasis filter f2 2 N 0 H de ( f ) , 2 A

f  BT / 2,

  f2  recall S No ( f )  2 N 0  A  

• Average power of noise with de de-emphasis emphasis PN  

W

W

f2 2 H ( f ) N 0 df de 2 A

• Improvement factor (using (7.1)) 2 N0W 3

3 PN without pre / de - emphasis 2 W 2 I  W 2 3A 2  W 2 f 2 pre / de - emphasis p PN with p ( f ) df W A2 H de ( f ) N0dff 3W f H de d

112

Example Circuits •

(a) Pre-emphasis filter H pe ( f )  1  jf / f 0 f 0  1/ (2 rC ),

•

R  r , 2 frC  1

(b) D De-emphasis h i filt filter

1 H de ( f )  1  jf / f 0 • Improvement I

2W 3 3

W

W

f 2 / (1  f 2 / f 0 2 )df

(W / f 0 )3  3[(W / f 0 )  tan 1 (W / f 0 )]

•

In commercial FM, FM W = 15 kHz kHz, f0 = 2.1 2 1 kHz  I = 22  13 dB (a significant gain) 113

Comparison of Analogue Systems •

Assumptions: – – – –

•

single-tone g modulation,, i.e.: m(t) ( ) = Am cos(2 (  fmt); ); the message bandwidth W = fm; for the AM system, µ = 1; for the FM system, β = 5 (which is what is used in commercial FM transmission, with ∆f = 75 kHz, and W = 15 kHz).

With these assumptions assumptions, we find that the SNR expressions for the various modulation schemes become: SNRDSB  SC  SNRbaseband  SNRSSB 1 SNRAM  SNRbaseband 3 3 75 SNRFM   2 SNRbaseband  SNRbaseband 2 2

without pre/deemphasis

114

Performance of Analog Systems

115

Conclusions • (Full) AM: The SNR performance is 4.8 dB worse than a baseband system, and the transmission bandwidth is BT = 2W . • DSB DSB: Th The SNR performance f iis id identical ti l tto a b baseband b d system, and the transmission bandwidth is BT = 2W. • SSB: The SNR performance is again identical, but the transmission bandwidth is only BT = W. • FM: The SNR performance is 15.7 dB better than a b baseband b d system, t and d th the ttransmission i i b bandwidth d idth iis BT = 2(β + 1)W = 12W (with pre- and de-emphasis the SNR performance is increased by about 13 dB with the same transmission bandwidth). 116


Lecture 8: Digital Representation of Signals Dr. Cong Ling Department of Electrical and Electronic Engineering

117

Outline • Introduction to digital communication • Quantization (A/D) and noise • PCM • Companding • References – Notes of Communication Systems, Chap. 4.1-4.3 – Haykin & Moher Moher, Communication Systems, Systems 5th ed ed., Chap Chap. 7 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 6

118

Block Diagram of Digital Communication

119

Why Digital? • Advantages: – Digital g signals g are more immune to channel noise by y using g channel coding (perfect decoding is possible!) – Repeaters along the transmission path can detect a digital signal and retransmit a new noise noise-free free signal – Digital signals derived from all types of analog sources can be represented using a uniform format – Digital signals are easier to process by using microprocessors and VLSI (e.g., digital signal processors, FPGA) – Digital systems are flexible and allow for implementation of sophisticated functions and control – More and more things are digital…

• For digital communication: analog signals are converted to digital. 120

Sampling • How densely should we sample an analog signal so that p its form accurately? y we can reproduce • A signal the spectrum of which is band-limited to W Hz, can be reconstructed exactly from its samples, if they are taken uniformly at a rate of R ≥ 2W Hz. • Nyquist frequency: fs = 2W Hz

121

Quantization • Quantization is the process of transforming the sample p into a discrete amplitude p taken from a finite set amplitude of possible amplitudes. • The more levels, the better approximation. • Don’t need too many levels (human sense can only detect finite differences). • Quantizers can be of a uniform or nonuniform type.

122

Quantization Noise • Quantization noise: the error between the input signal and p signal g the output

123

Variance of Quantization Noise • ∆: gap between quantizing levels (of a uniform quantizer) • q: Quantization error = a random variable in the range 

   q  2 2

• If ∆ is sufficiently small, small it is reasonable to assume that q is uniformly distributed over this range:

• Noise variance

  1  ,  q fQ (q)    2 2 0, otherwise

1  /2 2 PN  E{e }   q fQ (q)dq   q dq    /2 

2

3  /2

1q   3

  /2

2

1   3 ()3   2      24  12   24 124

SNR • Assume: the encoded symbol has n bits – the maximum number of q quantizing g levels is L = 2n – maximum peak-to-peak dynamic range of the quantizer = 2n∆

• P: power of the message signal • mp = max |m(t)|: maximum absolute value of the message signal • Assume: the message signal fully loads the quantizer: 1 (8.1) m p  2n   2n 1  2 • SNR at the quantizer output: PS 12 P P SNRo   2  2 PN  / 12  125

SNR • • • •

From (8.1)  I dB, In dB



mp 2

n 1



SNRo 

2m p 2

n

12 P 4 m2p 22 n

  2

4m 2p 22 n

3P 2 n  22 mp

(8.2)

 3P  SNRo (dB)  10 log10 (2 )  10 log10  2     mp  2n

 3P   20n log10 2  10 log10  2  m   p  3P   6n  10 log10  2  (dB) m   p • Hence, each extra bit in the encoder adds 6 dB to the output SNR quantizer. of the q • Recognize the tradeoff between SNR and n (i.e., rate, or bandwidth). 126

Example • Sinusoidal message signal: m(t) = Am cos(2 fmt). • Average signal power: Am2 P 2 • Maximum signal value: mp = Am. • Substitute into (8 (8.2): 2): 3 Am2 2 n 3 2 n SNRo  2  2 2 2 Am 2

• In dB SNRo (dB)  6n  1.8 1 8 dB

• Audio CDs: n = 16 ֜ SNR > 90 dB 127

Pulse--Coded Modulation (PCM) Pulse

• Sample the message signal above the Nyquist frequency • Quantize the amplitude of each sample • Encode the discrete amplitudes into a binary codeword isn t modulation in the usual sense; it’s it s a • Caution: PCM isn’t type of Analog-to-Digital Conversion. 128

The PCM Process

129

Problem With Uniform Quantization • Problem: the output SNR is adversely affected by peak to g p power ratio. average • Typically small signal amplitudes occur more often than large signal amplitudes. – The signal does not use the entire range of quantization levels available with equal probabilities. – Small amplitudes are not represented as well as large amplitudes amplitudes, as they are more susceptible to quantization noise.

130

Solution: Nonuniform quantization • Nonuniform quantization uses quantization levels of p g, denser at small signal g amplitudes, p , variable spacing, broader at large amplitudes. uniform if quantized

nonuniform if quantized

131

Companding = Compressing + Expanding • A practical (and equivalent) solution to nonuniform quantization: – Compress the signal first – Quantize it (using a uniform quantizer) – Transmit it – Expand E d it • Companding is the corresponding to pre-emphasis and de-emphasis scheme used for FM. • Predistort a message signal in order to achieve better performance in the presence of noise, and then remove the distortion at the receiver. • The Th exactt SNR gain i obtained bt i d with ith companding di depends d d on the th exactt form of the compression used. proper p companding, p g, the output p SNR can be made insensitive to • With p peak to average power ratio.

132

µ-Law vs. A A--Law

(a) µ-law used in North America and Japan Japan, (b) A-law used in most countries of the world. Typical values in practice (T1/E1): µ = 255, A = 87.6. 133

Applications of PCM & Variants • Speech: – PCM: The voice signal g is sampled p at 8 kHz,, q quantized into 256 levels (8 bits). Thus, a telephone PCM signal requires 64 kbps. • need to reduce bandwidth requirements

– DPCM (differential (diff ti l PCM): PCM) quantize ti the th difference diff between b t consecutive samples; can save 8 to 16 kbps. ADPCM (Adaptive DPCM) can go further down to 32 kbps. – Delta modulation: 1-bit DPCM with oversampling; has even lower symbol rate (e.g., 24 kbps).

• Audio CD: 16 16-bit bit PCM at 44 44.1 1 kHz sampling rate rate. • MPEG audio coding: 16-bit PCM at 48 kHz sampling rate compressed to a rate as low as 16 kbps kbps.

134

Demo • Music playing in Windows Sound Recorder

135

Summary • Digitization of signals requires – Sampling: p g a signal g of bandwidth W is sampled p at the Nyquist yq frequency 2W. – Quantization: the link between analog waveforms and digital representation. representation • SNR (under high-resolution assumption)  3P  SNRo (dB)  6n  10 log l 10  2  (dB) m   p • Companding can improve SNR.

• PCM is a common method of representing audio signals. – In a strict sense, “pulse coded modulation” is in fact a (crude) source coding technique (i (i.e, e method of digitally representing analog information). – There are more advanced source coding (compression) techniques in information theory. 136


Lecture 9: Baseband Digital Transmission Dr. Cong Ling Department of Electrical and Electronic Engineering

137

Outline • Line coding • Performance of baseband digital transmission – Model – Bit error rate

• References – Notes of Communication Systems, Chap. 4.4 – Haykin & Moher, Communication y , 5th ed.,, Chap. p 8 Systems, – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 7 138

Line Coding • The bits of PCM, DPCM etc need to be converted into g some electrical signals. • Line coding encodes the bit stream for transmission through a line, or a cable. • Line coding was used former to the wide spread application of channel coding and modulation techniques. • Nowadays, it is used for communications between the CPU and peripherals, and for short-distance baseband communications, such as the Ethernet. 139

Line Codes Unipolar nonreturn-to-zero (NRZ) signaling (on-off signaling) Polar NRZ signaling Unipolar Return-to-zero (RZ) signaling Bipolar RZ signaling Manchester code 140

Analog and Digital Communications • Different goals between analog and digital communication y systems: – Analog communication systems: to reproduce the transmitted waveform accurately. ֜ Use signal to noise ratio to assess the quality of the system – Digital communication systems: the transmitted symbol to be identified correctly by the receiver ֜ Use the probability of error of the receiver to assess the quality of the system

141

Model of Binary Baseband C Communication i ti System S t T

Lowpass Lowpass filter filter

n(t) n(t)

T

T

• We only consider binary PCM with on-off signaling: 0 → 0 and 1 → A with bit duration Tb. • Assume: – AWGN channel: The channel noise is additive white Gaussian,, with a double-sided PSD of N0/2. – The LPF is an ideal filter with unit gain on [−W, W ]. – The signal passes through the LPF without distortion (approximately). 142

Distribution of Noise • Effect of additive noise on digital transmission: at the y 1 mayy be mistaken for 0,, and vice versa. receiver,, symbol ֜ bit errors • What is the probability of such an error? • After the LPF, the predetection signal is y(t) = s(t) + n(t) – s(t): the binary-valued function (either 0 or A volts) – n(t): additive white Gaussian noise with zero mean and variance W

   N 0 / 2df N 0W 2

W

• Reminder: A sample value N of n(t) is a Gaussian random variable drawn from a probability density function (the normal distribution):  n2  1 pN ( n)  exp   2   N (0,  2 )  2  2 

143

Decision • Y: a sample value of y(t) • If a symbol 0 were transmitted: y(t) = n(t) – Y will have a PDF of N (0, σ2)

• If a symbol 1 were transmitted: y(t) = A + n(t) – Y will have a PDF of N (A, σ2)

• Use as decision threshold T : – if Y < T, – if Y > T,

choose symbol 0 choose symbol 1

144

Errors • Two cases of decision error: – ((i)) a symbol y 0 was transmitted,, but a symbol y 1 was chosen – (ii) a symbol 1 was transmitted, but a symbol 0 was chosen

Probability density functions for binary data transmission in noise: ((a)) symbol y 0 transmitted,, and (b) ( ) symbol y 1 transmitted. Here T = A/2. 145

Case (i) • Probability of (i) occurring = (Probability of an error, given symbol g y 0 was transmitted)) × ((Probabilityy of a 0 to be transmitted in the first place): p(i) = Pe0 × p0 where: – p0: the a priori probability of transmitting a symbol 0 – Pe0: the conditional probability of error error, given that symbol 0 was transmitted:

Pe 0  



T

 n2  1 dn exp  2   2  2 

146

Case (ii) • Probability of (ii) occurring = (Probability of an error, given symbol g y 1 was transmitted)) × ((Probabilityy of a 1 to be transmitted in the first place): p(ii) = Pe1 × p1 where: – p1: the a priori probability of transmitting a symbol 1 – Pe1: the conditional probability of error error, given that symbol 0 was transmitted:

 ( n  A)2  1 dn Pe1   exp  2    2 2   T

147

Total Error Probability • Total error probability: Pe (T )  p(i )  p(ii )  p1 Pe1  (1  p1 ) Pe 0   (n  A) 2   n2  1 1 exp   exp   2  dn  p1   dn  (1  p1 ) T 2  2  2  2    2  T

• Choose T so that Pe(T) is minimum: dPe ((T T) 0 dT

148

Derivation • Leibnitz rule of differentiating an integral with respect parameter: if to a p I ( )  

b( )

a ( )

f ( x;  )dx

• then

b (  ) f ( x;  ) dI ( ) db( ) da ( )  f (b( );  )  f ( a ( );  )   dx a (  ) d d d 

• Therefore  (T  A)2   T2  dPe (T ) 1 1   (1  p1 )  p1 exp  exp  2   0 2 2 dT  2  2    2   T 2  (T  A) 2  p1  exp    2  1  p1 2    T 2  T 2  A2  2TA   A(2T  A)   exp    exp    2 2   2 2    

149

Optimum Threshold p1 A(2T  A) ln  1  p1 2 2

p1  2 ln   A(2T  A)  1  p1 2

p1 p1 A 2 2 2 ln ln   2T  A  T    A 1  p1 A 1  p1 2

• Equi-probable symbols (p1 = p0 = 1 − p1) ֜ T = A/2. • For F equi-probable i b bl symbols, b l it can b be shown h th thatt Pe0 = Pe1. • Probability of total error:

Pe  p(i )  p(ii )  p0 Pe 0  p1 Pe1  Pe 0  Pe1 since p0 = p1 = 1/2, and Pe0 = Pe1.

150

Calculation of Pe • Define a new variable of integration n 1 z   dz d  dn d  dn d  dz d   – When n = A/2, z = A/(2σ). – When n = ∞, z = ∞.

• Then

 1  z 2 /2 Pe 0  e  dz  A /(2 )  2 1   z 2 /2  e dz  A /(2 ) 2

• We may express Pe0 in terms of the Q-function: 1 Q( x)  2

• Then:

 t2   x exp   2  dt 

 A Pe  Q    2 

151

Example • Example 1: A/σ = 7.4 ֜ Pe = 10−4 • ֜ For a transmission rate is 105 bits/sec, bits/sec there will be an error every 0.1 seconds • Example 2: A/σ = 11.2 ֜ Pe = 10−8 • ֜ For a transmission rate is 105 bits/sec, there will be an error everyy 17 mins • A/σ = 7.4: Corresponds p to 20 log g10((7.4)) = 17.4 dB • A/σ = 11.2: Corresponds to 20 log10(11.2) = 21 dB • ֜ Enormous increase in reliabilityy by y a relativelyy small increase in SNR (if that is affordable).

152

Probability y of Bit Error

153

Q-function • Upper bounds and good pp approximations: Q( x) 

1  x2 / 2 e , x0 2 x

– which becomes tighter g for large x, and

1  x2 / 2 Q( x)  e ,x0 2 – which is a better upper bound for small x.

154

Applications to Ethernet and DSL • 100BASE-TX – One of the dominant forms of fast Ethernet,, transmitting g up p to 100 Mbps over twisted pair. – The link is a maximum of 100 meters in length. – Line Li codes: d NRZ  NRZ IInvertt  MLT-3. MLT 3

• Digital subscriber line (DSL) – Broadband over twist pair pair. – Line code 2B1Q achieved bit error rate of 10-7 at 160 kbps. – ADSL and VDSL adopt p discrete multitone modulation ((DMT)) for higher data rates.

155


Lecture 10: Digital Modulation Dr. Cong Ling Department of Electrical and Electronic Engineering

156

Outline • ASK and bit error rate • FSK and bit error rate • PSK and bit error rate • References – Notes of Communication Systems, Chap. 4.5 – Haykin & Moher, Communication Systems, 5th ed., Chap. 9 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 13

157

Digital Modulation • Three Basic Forms of Signaling Binary Information

(a) Amplitude-shift keying (ASK).

(b) Phase-shift keying (PSK).

(c) Frequency Frequency-shift shift keying (FSK).

158

Demodulation

• Coherent (synchronous) demodulation/detection – Use a BPF to reject out-of-band noise – Multiply the incoming waveform with a cosine of the carrier frequency – Use a LPF – Requires q carrier regeneration g ((both frequency q y and p phase synchronization by using a phase-lock loop)

•

Noncoherent demodulation (envelope detection etc.) – Makes no explicit efforts to estimate the phase 159

ASK • Amplitude shift keying (ASK) = on-off keying (OOK) s0((t)) = 0 s1(t) = A cos(2 fct) or s(t) = A(t) cos(2fct), A(t) ‫{ א‬0, A} • Coherent detection

• Assume an ideal band-pass filter with unit gain on [fc−W, fc +W]. For a practical band-pass filter, 2W should be interpreted as the equivalent bandwidth. 160

Coherent Demodulation • Pre-detection signal:

x(t )  s (t )  n(t )  A cos(2 f c t )  nc (t ) cos(2 f c t )  ns (t ) sin(2 f c t )  [ A(t )  nc (t )]cos(2 f c t )  ns (t ) sin(2 f c t ) • After multiplication with 2cos(2 fct):

y(t )  [ A(t )  nc (t )]2cos2 (2 fct )  ns (t )2sin(2 fct )cos(2 fct )  [ A(t )  nc (t )](1  cos(4 fct ))  ns (t )sin(4 fct ) •

After low low-pass pass filtering: y (t )  A(t )  nc (t ) 161

Bit Error Rate • Reminder: The in-phase noise component nc(t) has the g band-pass p noise n(t) () same variance as the original – The received signal is identical to that for baseband digital transmission ~ (t ) – The Th sample l values l off y (t will ill h have PDF PDFs that th t are id identical ti l tto those of the baseband case

• For ASK the statistics of the receiver signal are identical to those of a baseband system • The ep probability obab y o of e error o for o ASK S is s the e sa same e as for o the e baseband case • Assume equiprobable q p transmission of 0s and 1s. • Then the decision threshold must be A/2 and the probability of error is given by:  A Pe, ASK  Q   2   

162

PSK • Phase shift keying (PSK) s(t) = A(t) cos(2 fct), A(t) ‫{ א‬−A, A} • Use U coherent h td detection t ti again, i tto eventually t ll gett th the detection signal: y (t )  A(t )  nc (t ) • Probability density functions for PSK for equiprobable 0s and 1s in noise (use threshold 0 for detection): – (a): symbol 0 transmitted – ((b): ) symbol y 1 transmitted

163

Analysis • Conditional error probabilities: Pe0  



0

 (n  A)2  1 exp   dn 2  2  2 

 (n  A)2  1 Pe1   exp   dn 2   2  2  0

0 n  A • In the first set n  n  A  dn  dn and when n  0, 1 Pe 0   2

 n~ 2  ~ A exp  2 2 dn 

• In the second set n  (n  A)  n  A  dn  dn when n  0, 0 n  A, and when n   , n   : 1 Pe1   2

 n 2  1  exp   2 2 (1)dn   2 A

 n 2  A exp   2 2 dn 

164

Bit Error Rate • So:

 n2  1 Pe0  Pe1  Pe,PSK   exp  2 dn A  2  2  

• Change variable of integration to z ≡ n/σ ֜ dn = σdz and when n = A, z = A/σ. Then: 1  z2 / 2  A Pe,PSK  d  Q  dz Ae  2    • Remember that 1 Q x   2





x

exp( t 2 / 2)dt

165

FSK • Frequency Shift Keying (FSK) s0((t)) = A cos(2 (  f0t), ), if symbol y 0 is transmitted s1(t) = A cos(2 f1t), if symbol 1 is transmitted • Symbol recovery: – Use two sets of coherent detectors, one operating at a frequency f0 and the other at f1.

Coherent FSK demodulation. demodulation The two BPF’s are nonoverlapping in frequency spectrum

y

166

Output • Each branch = an ASK detector i  A + noise LPF output on each branch =  noise

if symbol b l presentt if symbol not present

• n0(t): the noise output of the top branch • n1(t): the noise output of the bottom branch • Each of these noise terms has identical statistics to nc(t). (t) • Output if a symbol 1 were transmitted y = y1((t)) = A + [[n1((t)) − n0((t)] )] • Output if a symbol 0 were transmitted y = y0((t)) = − A + [n1((t)) − n0((t)] ) 167

Bit Error Rate for FSK • Set detection threshold to 0 • Difference from PSK: the noise term is now n1(t) − n0(t). • The noises in the two channels are independent because their spectra are non-overlapping. – the p proof is done in the p problem sheet. – the variances add. – the noise variance has doubled!

• Replace σ2 in (172) by 2σ2 (or σ by 2 σ ) Pe ,FSK

 A   Q   2  168

The Sum of Two R.V. • Noise is the sum or difference of two independent zero mean random variables: – x1: a random variable with variance σ12 – x2: a random variables with variance σ22

• What is the variance of y ≡ x1 ± x2? • By definition  y2  E{ y 2 }  E{ y}2  E{( x1  x2 ) 2 }  E{x12  2 x1 x2  x22 }  E{x12 }  E{x1 x2 }  E{x22 }

• For independent variables: E{x1x2} = E{x1}E{x2} • For zero-mean random variables: E{x1} = E{x2} = 0 ֜ E{x1x2} = 0 • So  y2  E{x12 }  E{x22 }   12   22 169

Comparison of Three Schemes

170

Comment • To achieve the same error probability (fixed Pe): • PSK can be reduced by 6 dB compared with a baseband or ASK system (a factor of 2 reduction in amplitude) • FSK can be reduced by 3 dB compared with a baseband or ASK (a factor of 2 reduction in amplitude) • Caution: The comparison is based on peak SNR. In terms of average SNR, PSK only has a 3 dB improvement over ASK, and FSK has the same performance as ASK

171

Q-function

172

Examples • Consider binary PSK modulation. Assume the carrier p A = 1 v,, and noise standard deviation  = 1/3. amplitude Determine the bit error probability. – Answer: Pe = 1.03  10-3.

• Now suppose the bit error probability is 10-5. Determine the value of A/. – Answer: A/ = 4.3.

173

Application: GMSK • Gaussian minimum shift keying (GMSK), a special form of preceded by y Gaussian filtering, g, is used in GSM FSK p (Global Systems for Mobile Communications), a leading cellular phone standard in the world. – Also known as digital FM, built on some of FM-related advantages of AMPS, g analog g system y the first-generation (30 KHz bandwidth). – Binary data are passed through a Gaussian filter to satisfy stringent requirements of out-of-band radiation. – Minimum Shift Keying: its spacing between the two frequencies of FSK is minimum in a certain sense (see problem sheet). – GMSK is allocated bandwidth of 200 kHz, shared among 32 users. This provides a (30/200)x32 (30/200)x32=4 4.8 8 times improvement over AMPS AMPS. 174


Lecture 11: Noncoherent Demodulation Dr. Cong Ling Department of Electrical and Electronic Engineering

175

Outline • Noncoherent demodulation of ASK • Noncoherent demodulation of FSK • Differential demodulation of DPSK • References – Haykin & Moher, Communication Systems, 5th ed., Chap. 9 – Lathi, Modern Digital and Analog Communication Systems, 3 d ed., 3rd d Ch Chap. 13

176

Noncoherent Demodulation • Coherent demodulation assumes perfect synchronization. – Needs a p phase lock loop. p

• However, accurate phase synchronization may be difficult in a dynamic channel. – Phase synchronization error is due to varying propagation delays, frequency drift, instability of the local oscillator, effects of strong noise ... – Performance of coherent detection will degrade severely.

• When e the e ca carrier e p phase ase is su unknown, o ,o one e must us rely eyo on nono coherent detection. – No provision is made for carrier phase recovery.

• The phase  is assumed to be uniformly distributed on [0, 2]. • Circuitry is simpler, but analysis is more difficult! 177

Noncoherent Demodulation of ASK Signal plus noise in

Bandpass filter

Envelope detector

Threshold device

Decision

t = nTb

• Output of the BPF y(t) = n(t) when 0 is sent y(t) = n(t) + A cos(2 fct) when 1 is sent • Recall n(t )  nc (t ) cos(2 f c t )  ns (t ) sin(2 f c t ) • Envelope p R  nc2 (t )  ns2 (t )

when 0 is sent

R  ( A  nc (t )) 2  ns2 (t )

when h 1 iis sentt 178

Distribution of the Envelope • When symbol 0 is sent, the envelope (that of the bandpass y g distribution noise alone)) has Rayleigh

f (r) 

r



2

r2 /(2 2 )

e

, r 0

• When symbol 1 is sent, the envelope (that of a signal + bandpass p noise)) has Rician distribution f (r) 

r



2

e

( r 2  A2 ) /( 2 2 )

 Ar  I 0  2 , r  0  

• The first case dominates the error p probability y when A/σ >> 1.

179

Rayleigh Distribution • Define a random variable R  X  Y where X and Y are independent p Gaussian with zero mean and variance  2 • R has Rayleigh distribution: 2

f R (r ) 

r

2

e

 r 2 / ( 2 2 )

2

,

r0

Normalized Rayleigh distribution

v = r/ fV(v) =  fR(r)

• Proving it requires change into polar coordinates: R  X 2 Y2 ,

  tan t 1

Y X 180

Derivation • Consider a small area dxdy = rdrd. One has f R , (r ,  )drd  f X ,Y ( x, y )dxdy  f X ,Y ( x, y )rdrd 

1 2

• Hence f R , ( r ,  ) 

e 2

r 2



e 2

x2  y 2



2 2

rdrd

x2  y 2 2 2

r



2

e 2



r2 2 2

• pdf of R: f R (r )  

2

0

• pdf of 



f R , (r ,  )d 

f  ( )   f R , (r ,  )dr  0

r



e 2



r2 2 2

,

r0

1 ,   [0, 2 ] 2

181

Rician Distribution • If X has nonzero mean A, R has Rician distribution: (r2  A2 )/(2 2 )

fR (r)   2 e r

where

I 0 ( x) 

1 2



2

0

A I0 (Ar 2 ),

r 0

e x cos d

is the modified zero-order Bessel function of the first kind. Normalized Rician distribution

v = r/ a = A/ fV(v) =  fR(r)

182

Derivation • Similarly,

f R , (r ,  )drd  f X ,Y ( x, y )rdrd  

1 2

e 2

1 2

e

2





( x  A )2  y 2 2 2

r 2  A2  2 Ar cos

• Hence f R , ( r ,  ) 

r 2

e 2

rdrd



2 2

rdrd

r 2  A2  2 Ar cos 2 2

Bessel function

• pdf of R: f R (r)  

2

0

f R , ( r,  )d 

r



e 2



r 2  A2 2 2

1 2



2

0

Ar cos 

e

2

d ,

r0 183

Error Probability • Let the threshold be A/2 for simplicity. • The error p probability y is dominated by y symbol y 0,, and is given by 1  r  r 2 /(2 2 ) Pe   e dr 2 A /2 2 

• The final expression Pe, ASK ,noncoherent

1  A2 /(8 2 )  e 2

• Cf. Cf coherent demodulation Pe, ASK ,Coherent

 A  Q  2

 1  A2 /(8 2 )  e  2

• Noncoherent demodulation results in some performance degradation. g Yet,, for a large g SNR,, the p performances of coherent and noncoherent demodulation are similar. 184

Noncoherent Demodulation of FSK

Bandpass filter centered at f1

R1 If R1 > R2, choose 0 If R1 < R2, choose 1

Bandpass filter centered at f2

R2

185

Distribution of Envelope • When a symbol 1 is sent, outputs of the BPFs y1((t)) = n1((t)) y2(t) = n2(t) + A cos(2 f2t) • Again, the first branch has Rayleigh distribution f R1 (r1 ) 

2

2

 r1 /(2 ) , r1  0 2 e  r1

• while the second has Rice distribution f R2 (r2 ) 

r2



2

e

 ( r22  A2 )/(2 2 )

I 0 ( Ar22 ), )

r2  0

• Note the envelopes p R1 and R2 are statisticallyy independent. p

186

Error Probability • Error occurs if Rice < Rayleigh Pe  P( R2  R1 ) 











0



0

0

 e 1 2



r2

2 r2 

e

 ( r2 2  A2 )/(2 2 )

r2

 (2 r2 2  A2 )/(2 2 )



2

e

r1

I0 (  2 )  2 e

I0 (  2 )

e 2

 ( r2 2  A2 )/( )/(2 2 )

r2

Ar2

Ar2



r2

 A2 /(4 2 )





0

x



2

e

 r12 /(2 2 )

2

dr1dr2

2

 r1 /(2 /(  ) dr d 1dr d2 2 e  r1

I 0 ( Ar22 )dr2

 ( x 2  2 )/(2 2 )

I 0 (  x2 )dx

x  2r2 ,   A / 2

• Observe the integrand is a Rician density 1  A2 /(4 2 ) Pe , FSK ,noncoherent  e 2 • Cf. coherent demodulation Pe , FSK ,Coherent

 A  1  A 2 / ( 4 2 )  Q  2e  2  187

DPSK: Differential PSK • It is impossible to demodulate PSK with an envelop g have the same frequency q y and detector,, since PSK signals amplitude. • We can demodulate PSK differentially, where phase reference is provided by a delayed version of the signal in the previous interval. • Differential encoding is essential: bn = bn−1  an, where an, bn ‫ א‬1. bn

input

an

DPSK signal

bn-1 Delay Tb

cos(ct)

188

Differential Demodulation signal plus noise

Bandpass filter

Lowpass filter

Threshold device

Decision D i i on an

Delay Tb

• Computing the error probability is cumbersome but fortunately the final expression is simple: 1 A2 /(2 2 ) Pe,DPSK  e 2 – The derivation can be found in Haykin, Communication Systems, 4th ed., Chap. 6. – Performance is degraded in comparison to coherent PSK.

• Cf. coherent demodulation Pe ,PSK

 A  1  A2 /( 2 2 )  Q   e   2 189

Illustration of DPSK Information symbols {an}

1

-1

-1

1

-1

-1

1

1

{bn-1}

1

1

-1

1

1

-1

1

1

Differentially encoded sequence {bn}

1

1

-1

1

1

-1

1

1

1

Transmitted phase (radians)

0

0



0

0



0

0

0

Output of lowpass filter (polarity)

+

–

–

+

–

–

+

+

ec s o Decision

1

-1

-1

1

-1

-1

1

1

Note: The symbol 1 is inserted at the beginning of the differentially encoded sequence is the reference symbol. 190

Summary and Comparison Scheme

Bit Error Rate

Coherent ASK

Q(A/2)

Coherent FSK

Q(A/2)

Coherent PSK

Q(A/)

Noncoherent ASK

½ exp( exp(-A A2/82)

Noncoherent FSK

½ exp(-A p( 2/42)

DPSK

½ exp(-A2/22)

Noncoherent ASK, FSK Coherent ASK, FSK

Coherent PSK

Caution: ASK and FSK have the same bit error rate if measured by average SNR. Average SNR, dB 191

Conclusions • Non-coherent demodulation retains the hierarchy of p performance. • Non-coherent Non coherent demodulation has error performance slightly worse than coherent demodulation, but approaches coherent performance at high SNR. • Non-coherent demodulators are considerably easier to build.

192

Application: DPSK • WLAN standard IEEE 802.11b • Bluetooth2 • Digital audio broadcast (DAB): DPSK + OFDM (orthogonal frequency division multiplexing) • Inmarsat (International Maritime Satellite Organization): now a London-based mobile satellite company

193


Lecture 12: Entropy and Source Coding Dr. Cong Ling Department of Electrical and Electronic Engineering

194

Outline • What is information theory? • Entropy – Definition of information – Entropy of a source

• Source coding – Source coding theorem – Huffman coding

• References – N Notes t off Communication C i ti S Systems, t Ch Chap. 5 5.1-5.4 154 – Haykin & Moher, Communication Systems, 5th ed., Chap. 10 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 15 195

Model of a Digital Communication System

196

What is Information Theory • C. E. Shannon, ”A mathematical theory of , Bell System y Technical communication,” Journal, 1948. • Two fundamental questions in communication theory: – What is the ultimate limit on data compression? – What is the ultimate transmission rate off reliable communication over noisy channels?

• Shannon showed reliable (i (i.e., e error-free) error free) communication is possible for all rates below channel capacity (using channel coding). • Any source can be represented in bits at any rate above entropy (using source coding). – Rise of digital information technology 197

What is Information? • Information: any new knowledge about something • How can we measure it? • Messages containing knowledge of a high probability of occurrence ֜ Not very informative • Messages containing knowledge of low probability of occurrence ֜ More informative • A small change in the probability of a certain output should not change the information delivered by that output by a large amount.

198

Definition • The amount of information of a symbol s with probability p y p:

1 I (s)  log p • Properties: – p = 1 ֜ I(s) ( ) = 0: a symbol y that is certain to occur contains no information. – 0 ≤ p ≤ 1 ֜ 0 ≤ I(s) ≤ ∞: the information measure is monotonic and non-negative. – p = p1  p2 ֜ I(s) = I(p1) + I(p2): information is additive for statistically independent events. 199

Example • In communications, log base 2 is commonly used, g in unit bit resulting • Example: Two symbols with equal probability p1 = p2 = 0.5 • Each symbol represents I(s) = log2(1/0.5) = 1 bit of information. • In summary:

I ( s )  log 2

1 bits p

• Reminder: From logs of base 10 to logs of base 2: log2 x 

log10 x  3.32 log10 x log10 2 200

Discrete Memoryless Source • Suppose we have an information source emitting a q of symbols y from a finite alphabet: p sequence S = {s1, s2, . . . , sK} • Discrete memoryless source: The successive symbols are statistically independent (i.i.d.) • Assume that each symbol has a probability of occurrence K

pk , k  1,..., K , such that  pk  1 k 1

201

Source Entropy • If symbol sk has occurred, then, by definition, we have received

I (sk )  log2 bits of information.

1   log2 pk pk

• The expected value of I(sk) over the source alphabet K

K

k 1

k 1

E{I ( sk )}   pk I ( sk )    pk log l 2 pk • Source entropy: the average amount of information per source symbol: K

H ( S )    pk log2 pk k 1

• Units: bits / symbol.

202

Meaning of Entropy • What information about a source does its entropy give us? • It is the amount of uncertainty before we receive itit. • It tells us how many bits of information per symbol we expect to get on the average. • Relation with thermodynamic entropy – In thermodynamics: entropy measures disorder and randomness; – In information theory: entropy measures uncertainty. – Some argue that to obtain 1 bit information, the minimum energy needed is 10-23 Joules/Kelvin degree (extremely cheap!). – This may have implications on green computation. computation

203

Example: Entropy of a Binary Source • s1: occurs with probability p • s0: occurs with probability 1 − p. • Entropy of the source: H ( S )  (1  p ) llog 2 (1  p )  p log l 2 p  H ( p)

H(p)

– H(p) is referred to as the entropy function.

Maximum uncertainty when p = 1/2. p 204

Example: A Three-Symbol Alphabet – A: occurs with probability 0.7 – B: occurs with probability 0.2 – C: occurs with probability 0.1

• Source entropy: H ((SS )  0.7 log2 (0.7)  0.2 log2 (0.2)  0.1 log2 (0.1)  0.7  0.515  0.2  2.322  0.1  3.322  1.157 bits/symbol

• How can we encode these symbols in order to transmit them? • We need 2 bits/symbol if encoded as

A = 00, B = 01, C = 10 • Entropy prediction: the average amount of information is only 1.157 bits per symbol. • We W are wasting ti bits! bit ! 205

Source Coding Theory • What is the minimum number of bits that are required to particular symbol? y transmit a p • How can we encode symbols so that we achieve (or at least come arbitrarily close to) this limit? • Source encoding: concerned with minimizing the actual number of source bits that are transmitted to the user • Channel encoding: concerned with introducing redundant bits to enable the receiver to detect and possibly correct errors that are introduced by the channel. 206

Average Codeword Length • • • •

lk: the number of bits used to code the k-th symbol K: total number of symbols pk: the probability of occurrence of symbol k Define the average codeword length: K

L   p k lk k 1

• It represents the average number of bits per symbol in the source alphabet. alphabet • An idea to reduce average codeword length: symbols that occur often should be encoded with short codewords;; symbols that occur rarely may be encoded using the long codewords. 207

Minimum Codeword Length • What is the minimum codeword length for a particular p of source symbols? y alphabet • In a system with 2 symbols that are equally likely: – Probability of each symbol to occur: p = 1/2 – Best one can do: encode each with 1 bit only: 0 or 1

• In a system with n symbols that are equally likely: Probability of each symbol to occur: p = 1/n • One needs L = log2 n = log2 1/p = − log2 p bits to represent the symbols.

208

The General Case • • • • •

S = {s1, . . . , sK}: an alphabet pk: the p probability y of symbol y sk to occur N : the number of symbols generated We expect Npk occurrences of sk Assume: – The source is memoryless – allll symbols b l are iindependent d d t – the probability of occurrence of a typical sequence SN is:

p ( S N )  p1Np1  p2 Np2  ...  pK NpK

• All such typical sequences of N symbols are equally lik l likely • All other compositions are extremely unlikely to occur as N → ∞ (Shannon, (Shannon 1948) 1948). 209

Source Coding Theorem • The number of bits required to represent a typical q SN is sequence LN  log 2

1   log 2 ( p1Np1  p2 Np2  ...  pK NpK ) p(S N )

  Np N 1 log l 2 p1  Np N 2 log l 2 p2  ...  Np N K log l 2 pK K

  N  pk log 2 pk  NH ( S ) k 1

• Average length for one symbol: L  LN  H ( S ) bits / symbol N

• Given a discrete memoryless source of entropy H(S), the average g codeword length g for any y source coding g scheme is bounded by H(S):

L  H (S ) 210

Huffman Coding • • • • •

How one can design an efficient source coding algorithm? Use Huffman Coding (among other algorithms) It yields the shortest average codeword length. Basic idea: choose codeword lengths so that moreprobable sequences have shorter codewords Huffman Code construction – Sort the source symbols in order of decreasing probability. – Take the two smallest p( p(xi) and assign g each a different bit ((i.e., 0 or 1). Then merge into a single symbol. – Repeat until only one symbol remains.

• •

It s very easy to implement this algorithm in a It’s microprocessor or computer. Used in JPEG,, MP3… 211

Example

Read diagram backwards for codewords

• The average g codeword length: g L  (2  0.4)  (2  0.2)  (2  0.2)  (3  0.1)  (3  0.1)  2.2

• More than the entropy H(S) = 2.12 bits per symbol. ֜ Room for further improvement. 212

Application: File Compression • A drawback of Huffman coding is that it requires g of a p probabilistic model,, which is not always y knowledge available a prior. • Lempel-Ziv coding overcomes this practical limitation and has become the standard algorithm for file compression. – compress, gzip, GIF, TIFF, PDF modem… PDF, modem – A text file can typically be compressed to half of its original size.

213

Summary • The entropy of a discrete memoryless information source K

H ( S )    pk log2 pk k 1

• Entropy function (entropy of a binary memoryless source) H ( S )  (1  p ) log 2 (1  p )  p log 2 p  H ( p )

• Source coding theorem: The minimum average codeword length for any source coding scheme is H(S) for a discrete memoryless source. • Huffman coding: An efficient source coding algorithm.

214


Lecture 13: Channel Capacity Dr. Cong Ling Department of Electrical and Electronic Engineering

215

Outline • Channel capacity – Channel coding theorem – Shannon formula

• Comparison with practical systems p y • References – Notes of Communication Systems, Chap. 5.5-5.6 – Haykin & Moher, Communication Systems, 5th ed., Chap. 10 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 15

216

Channel Coding Theorem • Shannon, “A mathematical theory of communication,” 1948 • For any channel, there exists an information capacity C (whose calculation is beyond the scope of this course). • If the transmission rate R ≤ C, then there exists a coding scheme such that the output of the source can be transmitted over a noisy channel with an arbitrarily small probability of error. Conversely, it is not possible ibl to t transmit t it messages without ith t error if R > C. • Important implication: – The basic limitation due to noise in a communication channel is not on the reliability of communication, but rather, on the speed of communication. i ti 217

Shannon Capacity Formula • For an additive white Gaussian noise (AWGN) channel, p y is the channel capacity  P  C  B log 2 1  SNR   B log 2  1   N B 0  

bps

– B: the bandwidth of the channel – P: the average signal power at the receiver – N0: the single-sided PSD of noise

• Important implication: We can communicate error free up to t C bits bit per second. d • How can we achieve this rate? – D Design i power error correcting ti codes d tto correctt as many errors as possible (the next two lectures). – Use the ideal modulation scheme that does not lose information in the detection process. 218

Shannon Limit • Tradeoff between power and bandwidth: to get the same g capacity p y C, one mayy target – Increase power, decrease bandwidth; – Decrease power, increase bandwidth.

Unachievable region

• What’s the minimum power to send 1 bps (i.e., C = 1 bps)?

Achievable region

1/B B

P 21/ B  1 1/ B  B(2  1)  N0 1/ B

Capacity boundary

P 21/ B ln 2( B 2 )  lim  lim  ln 2 2 B  N B  B 0

P/N0 (dB)

 0.693 0 693  1.6 1 6 dB

• This is the ultimate limit of green communications. – For more information, see http://www.greentouch.org 219

Example • A voice-grade channel of the telephone network has a bandwidth of 3.4 kHz. – (a) Calculate the capacity for a SNR of 30 dB. – (b) Calculate the SNR required to support a rate of 4800 bps.

• Answer: • (a) 30 dB  SNR = 1000 C  B log 2 (1  SNR)  3.4  log g 2 ((1  1000))  33.9 kbps

• (b)

SNR  2C / B  1  24.8/3.4  1  1.66 1 66  2.2 2 2 dB 220

General Model of a Modulated Communication System

221

Input/Output SNR • The maximum rate at which information may arrive at the receiver is

Cin  B log l 2 (1  SNR S in )

– SNRin: the predetection signal-to-noise ratio at the input to the demodulator demodulator.

• The maximum rate at which information can leave the receiver is

Co  W log2 (1  SNRo )

– SNRo: the SNR at the output of the postdetection filter.

• For the ideal modulation scheme: Cin = Co ֜ 1 + SNRo = [1 + SNRin]B/W • For high SNR, SNR SNRo  SNRinB/W • It seems that spreading p g the bandwidth would make the output SNR increase exponentially (not true). 222

A More Elaborate Analysis • If the channel noise has a double-sided white PSD of N0/2 ֜ the average g noise p power at the demodulator will be N0B. • If the signal power is P : SNRini  P  SNRbaseband N 0W

P W P  N 0 B B N 0W

: the baseband SNR • Increasing B will reduce SNRin, and thus  SNRbaseband  W logg 2 ((1  SNRo )  B logg 2 1   B /W  

• Output SNR of an ideal communication system: B /W

SNRbaseband   SNRo   1  1  B /W    e SNRbaseband as B / W  

x

 1 lim 1    e x   x 223

SNR vs. Bandwidth

SNRo (d dB)

B/W=12

B/W 2 B/W=2

B/W=1

SNRbaseband (dB) 224

Implications to Analog Systems • A bandwidth spreading ratio of B/W=1 corresponds to both SSB and baseband. • A bandwidth spreading ratio of B/W=2 corresponds to DSB and AM (the ideal system would 2 SNRo  SNRbase /4 have )). • A bandwidth spreading ratio of B/W=12 corresponds to g and AM ((the ideal system y commercial FM broadcasting would have SNR o  e SNR ). • SSB and baseband systems provide noise performance id ti l tto th identical the id ideall ((which hi h iis ttrivial). i i l) • DSB has a worse performance because of its additional bandwidth requirements. requirements • FM systems only come close to the ideal system near threshold. base

225

Noise Performance of Analog Communication Systems

226

Discussion • Known analogue communication systems do not achieve performance in g general. the ideal p • Something must be missing with analogue communications. • Similarly, it can be shown that simple digital modulation schemes cannot achieve capacity too. • One must resort to digital communications and coding to approach the capacity. • The idea of tradeoff between bandwidth and SNR is useful in spread-spectrum communications (CDMA, 3G mobile communications) and ultra wideband wideband.

227


Lecture 14: Block Codes Dr. Cong Ling Department of Electrical and Electronic Engineering

228

Outline • Introduction • Block codes • Error detection and correction • Generator matrix and parity-check parity check matrix • References – Haykin & Moher, Communication Systems, 5th ed., Chap. 10 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed ed., Chap Chap. 16 229

Taxonomy of Coding Error Correction Coding = FEC - no feedback channel

Cryptography (Ciphering)

Error Detection Coding - used in ARQ as in TCP/IP - feedback channel - retransmissions

Source Coding

Error Control Coding

- Makes bits equally probable

- Strives to utilize channel capacity by adding extra bits

Compression Coding

- Redundancyy removal: - Destructive (jpeg, mpeg) - Non-destructive (zip)

Line Coding - for baseband communications

FEC: Forward Error Correction ARQ: Automatic Repeat Request

230

Block vs. Convolutional Codes • Block codes k bits

(n,k)

n bits

k input bits

encoder

– Output of n bits depends only on the k input bits n output bits

• Convolutional codes – Each source bit influences n(L+1) output bits – L is the memory length – Like Lik convolution l ti iin a lilinear filt filter

i input bit bi

n(L+1) output bits

k = 11, n = 22, L = 2 231

Noise and Errors • Noise can corrupt the information that we wish to transmit. • Generally corruption of a signal is a bad thing that should be avoided, if possible. • Different systems will generally require different levels of protection against errors due to noise. • Consequently, a number of different techniques have been developed to detect and correct different types and numbers of errors.

232

Channel Model • We can measure the effect of noise in different ways. The most common is to specify an error probability, probability p. p Consider the case of a Binary Symmetric Channel with noise. 1

1-p p

0

1 p

1-p

0

• All examples considered here will be for error probabilities that are symmetric, stationary and statisticallyy independent. p 233

Simple Error Checks • •

If the error probability is small and the information is fairly fault tolerant, it is possible to use simple methods to detect errors. Repetition – Repeating each bit in the message – If the two symbols in an adjacent pair are different, it is likely that an error has occurred. – However, this is not very efficient (efficiency is halved). – Repetition provides a means for error checking, but not for error correction.

•

P it bit – Use Parity U off a ‘parity ‘ it bit’ att the th end d off the th message – A parity bit is a single bit that corresponds to the sum of the other message bits (modulo 2). – This allows any odd number of errors to be detected, but not even numbers. – As with repetition, p , this technique q only y allows error checking, g, not error correction. – It is more efficient than simple repetition.

234

Block Codes • An important class of codes that can detect and correct some errors are block codes • The first error-correcting block code was devised by Hamming around the same time as Shannon was working on the foundation of information theory – Hamming codes are a particular class of linear block code

• Block codes – Encode a series of symbols from the source, a ‘block’, into a longer string: codeword or code block – Errors can be detected as the received coded block will not be one of the recognized, valid coded blocks – Error correction: To “decode” and associate a corrupted block to a valid coded block by its proximity (as measured by the “Hamming distance”) d s a ce ) 235

Binary Fields and Vectors • • •

We need to discuss some of mathematics that will be needed. Fortunately, we have restricted things to binary sources, so the mathematics is relatively simple. The binary alphabet A={0,1} is properly referred to as a Galois field with two elements elements, denoted GF(2). – Addition (XOR) 0 + 0 = 0, 0 + 1 = 1 + 0 = 1, 1 + 1 = 0 – Multiplication (AND) 0‫ڄ‬1 = 0‫ڄ‬0 = 1‫ڄ‬0 = 0, 1‫ڄ‬1 = 1 – This is also referred to as Boolean arithmetic in Digital Electronics or modulo-2 arithmetic.

•

A message is built up from a number of binary fields, and forms a bi binary vector, t rather th th than a llarger bi binary number. b – Hence,

101 ≠ 5

101={1}{0}{1}

236

Example • i. ii ii. iii.

Calculate the following examples of binary field arithmetic: 01001 + 01110 10010  01110 (1111+0011)  0011

• i i. ii. iii iii.

Answers 00111 00010 1100  0011 = 0000

237

Hamming Distance •

•

Hamming Weight – The Hamming weight of a binary vector, a (written as wH(a)), is the number of non non-zero zero elements that it contains contains. – Hence, g weight g of 5. 001110011 has a Hamming 000000000 has a Hamming weight of 0. Hamming Distance – The Hamming Distance between two binary vectors, a and b, is written dH(a,b) , and is equal to the Hamming weight of their ((Boolean)) sum. dH(a,b) = wH(a+b) – Hence, 01110011 and 10001011 have a Hamming distance of dH = wH (01110011+10001011) = wH (11111000) = 5

238

Linear Block Codes • A binary linear block code that takes block of k bits of source data and encodes them using n bits, is referred to as a (n, k) binary linear block code. – The ratio between the number of source bits and the number of bits used in the code code, R=k/n, k/n is referred to as the code rate.

• The most important feature of a linear block code – Linearity: the Boolean sum of any codewords must be another codeword. – This means that the set of code words forms a vector space, within ithi which hi h mathematical th ti l operations ti can b be d defined fi d and d performed.

239

Generator Matrix •

•

To construct a linear block code we define a matrix, the generator matrix G, that converts blocks of source symbols b l into i t llonger bl blocks k corresponding di tto code d words. d G is a k×n matrix (k rows, n columns), that takes a source block u (a binary vector of length k), k) to a code word x (a binary vector of length n), x=u‫ڄ‬G

u

x

240

Systematic Codes k information bits

(n-k) parity-check bits

codeword

• Systematic codes: The first k bits will be the original source block. • From linear algebra, a k×n matrix of linearly independent y be written into the systematic y form: rows can always G = [Ik ×k ⋮ Pk ×(n −k )] where Ik ×k is the k×k identityy matrix,, and Pk×(n−k) is a k×(n−k) matrix of parity bits. y code. • Everyy linear code is equivalent to a systematic 241

Example Given the generator matrix for a (7,4) systematic code

1 0 G 0  0

0 0 0 0 1 1 1 0 0 1 0 1 0 1 0 1 1 0  0 0 1 1 1 1

Calculate the code words for the following source blocks: i. 1010 [Answer: x = 1010101] ii. 1101 [Answer: x = 1101001] iii. 0010 [Answer: x = 0010110] iv iv. Calculate the Hamming distances between each pair of code words generated in parts (i) to (iii), and compare them to the Hamming distances for the original source blocks. [Answer: 4, 7 3, 7, 3 llarger th than 3 3, 4 4, 1] 242

Error Detection •

• •

•

To determine the number of errors a particular code can detect and correct, we look at the minimum Hamming distance between any two code d words. d From linearity the zero vector must be a code word. If we define the minimum distance between any two code words to be dmin = min{dH(a,b), a,b‫א‬C} = min{dH(0,a+b), a,b‫א‬C} = min{wH(c), c‫א‬C, c≠0} where C is the set of code words. The number of errors that can be detected is then (dmin–1), since dmin errors can turn an input code word into a different but valid code word word. Less than dmin errors will turn an input code word into a vector that is not a valid code word.

243

Error Correction • The number t of errors that can be corrected is simply the number of errors that can be detected divided by two and rounded down to the nearest integer, since any output vector with less than this number of errors will ‘nearer’ to the input code word.

dmin 2t 2 +1

dmin< 2t 244

Parity Check Matrix • •

•

•

To decode a block coded vector, it is more complicated. If the generator matrix is of the form G = [Ik ×k ⋮ Pk ×(n −k )] To check for errors, we define a new matrix, the parity check matrix, H. The p parity y check matrix is a (n-k)×n matrix that is defined so that – It will produce a zero vector when no error in the received code vector x, x·HT = 0 (14 1) (14.1) where HT is the transpose of H. To satisfy this condition condition, it is sufficient to write the parity check matrix in the form H = [(Pk ×(n-k))T ⋮ I(n −k ) ×(n −k )] The minimum Hamming distance is equal to the smallest number of columns of H that are linearly dependent. – This follows from the condition ((14.1). ) See Haykin, y , Chap. p 10 for p proof. 245

Syndrome • If the received coded block y contains errors, then the product of the received block with the transpose of the parity check matrix will not be zero, y ‫ ڄ‬HT ≠ 0 • Writing y = x + e which is the sum of the original coded block x and the error e, we find that y ‫ ڄ‬HT = e ‫ ڄ‬HT • This value is referred to as the syndrome, s = y ‫ ڄ‬HT = e ‫ڄ‬ HT. • The syndrome is a function of the error only, and contains the information required to isolate the position (or positions) of the error (or errors).

246

Syndrome Decoding • s is an (n-k) row vector, taking 2(n-k) -1 possible values (the p g to no error). ) zero vector corresponding • This means that it is necessary to calculate and store 2(n-k) -1 syndromes as a look-up table to be able to pin-point the positions of the errors exactly. • Problem: this is impractical for large values of (n – k). Received vector t y

Parity-check matrix H

Syndrome s

Syndrome table

Error E vector e

Decoded codword x

247

Summary • A linear block code can be defined by a generator matrix G and the associated p parity-check y matrix H. • Every linear block code is equivalent to a systematic code. • The key parameter of a linear block code is the minimum Hamming distance dmin: – Up to dmin – 1 errors can be detected; – Up to (dmin – 1)/2 errors can be corrected.

• Syndrome decoding: compute the syndrome and then find th error pattern. the tt – Only practical for short codes.

248


Lecture 15: Cyclic Codes Dr. Cong Ling Department of Electrical and Electronic Engineering

249

Outline • Hamming codes – A special type of cyclic codes that correct a single error

• Cyclic codes – Can correct more errors – The most important p class of block codes – Implementation takes advantage of polynomial multiplication/division

• References – Haykin ay & Moher, o e , Communication Co u cat o Systems, Syste s, 5t 5th ed ed.,, C Chap. ap 10 0 – Lathi, Modern Digital and Analog Communication Systems, 3rd ed., Chap. 16

250

Hamming Codes • Hamming codes are a class of linear block codes that g error. Theyy satisfy y the condition can correct a single r = n – k = log2(n+1)  n = 2r – 1, k = 2r – 1 – r. • From this expression, it is easy to see that the first few Hamming codes correspond to (n, k) = (7,4), (15,11), (31,26),… • They are easy to construct and are simple to use. – For all Hamming codes, dmin = 3. – All (correctable) error vectors have unit Hamming weight, and the syndrome associated with an error in the i’th column of the vector is the i’th row of HT .

• Columns of H are binary representations of 1, …, n.

251

Syndrome Table •

•

For the (7,4) Hamming code, the parity check matrix is 0 1 1 1 1 0 0  H  1 0 1 1 0 1 0  1 1 0 1 0 0 1 

G Generator t matrix ti

1 0 G 0  0

•

0 0 0 0 1 1 1 0 0 1 0 1 0 1 0 1 1 0  0 0 1 1 1 1

The corresponding syndrome table is:

s  eH T •

s

e

000

0000000

001

0000001

010

0000010

100

0000100

111

0001000

110

0010000

101

0100000

011

1000000

When a coded Wh d d vector t iis received, i d th the syndrome d iis calculated l l t d and d any single error identified, and corrected by exchanging the relevant bit with the other binary value – However, problems can occur if there is more than one error. 252

Example • Consider the (7,4) Hamming code. If the code vector 1000011 is sent while 1000001 and 1001100 are received,, decode the information bits. • Answer: – The first vector s=(1000001)HT=(010) e=(0000010) x=(1000001)+(0000010)=(1000011) u=(1000) correct (as there is one error) – The second vector s=(1001100)HT=(000) error-free x=(1001100) ( ) wrong g ((as there are 4 errors)) u=(1001) 253

Cyclic Codes • •

•

•

Cyclic codes are a subclass of linear block codes offering larger Hamming distances, thereby stronger error-correction capability. Whereas the principal property of the simple linear block code is that the sum of any two code words is also a code word, the cyclic property: p y a cyclic y shift of any y code word codes have an additional p is also a code word. A cyclic shift of a binary vector is defined by taking a vector of length n, a = [a0, a1, …, an-1] g g the elements, and rearranging a = [an-1, a0, a1, …, an-2] A code is cyclic if: (c0, c1, …, cn-1) ‫ א‬C ֜ (cn-1, c0, …, cn-2) ‫ א‬C

254

Generator Matrices • A cyclic code is still a linear block code, so all of the properties p p p previously y discussed hold for cyclic y codes. • They are constructed by defining a generator matrix, and an associated parity check matrix and are decoded using syndromes in exactly the same way as the other linear block codes that we have discussed. • A generator matrix is defined in the same way as before, except that the rows are now cyclic shifts of one ndimensional basis vector vector.  g0 0 G   0

g1  g n k

0

g0

g1



g n k











0

g0

g1

0   0       g n k  

255

Encoding •

•

•

The cyclic property leads to a very useful property: they can be represented simply (mathematically and in hardware) by polynomial operations. operations We start by looking at the code word x generated by a source block u, x = u ‫ ڄ‬G = [x0, x1,…, xn-1] With the generator matrix in the cyclic (non-systematic) form given in the previous slide slide, the elements of the code words are are, x0 = u0g0 x1 = u0g1+ u1g0 … xn-1 = uk-1gn-k These elements take the general form, k 1

xl   ui gl i i 0

256

Polynomial Representation •

•

•

•

Taking a binary vector x, it is possible to represent this as a y in z ((over the binary y field)) polynomial x = [ x0 , x1 ,… , xn−1 ] → x (z) = x0 zn−1 + x1 zn−2 +…+ xn−1 Representing the original code word and the n-dimensional basis vector for the cyclic code as polynomials, polynomials u = [u0 , u1 ,… , uk −1 ] → u (z) = u0 zk −1 + u1 zk −2 +…+uk −1 g = [g0 , g1,… , gn−k ,0,…,0] → g (z) = g0 zn−k + g1zn−k−1 +…+ gn−k We notice that the product of these two polynomials is exactly the same as the h polynomial l i l representation i off the h corresponding di code d vector. Nonsystematic y Encoding: g x(z) = u(z) ‫ ڄ‬g(z) This means that the problem of matrix-vector multiplication can be reduced to a problem of polynomial multiplication.

257

Advantage • The main advantage is that it simplifies the cyclic shift operation and thereby simplifies the hardware operation, implementation of the code considerably. • Multiplication of the code word by z shifts all of the coefficients along by one, and replacing the term (x0zn) byy a term ((x0z0)), g gives a cyclic y shift. • A simpler way to achieve the same result is to multiply the polynomial by z, divide by zn-1, and take the remainder term, which can be written as (z x(z)) mod(zn −1)

258

Parity Check Polynomials •

•

• •

One property important to cyclic codes is the ability to factorise polynomials. Given a polynomial of the form a(z) = zn−1 = zn+1 (n > 1), it is always possible to find two polynomials polynomials, such that that, zn+1 = g(z) h(z) g g(z) to be a g generator p polynomial, y , this condition is sufficient for Taking a resultant code to be an (n,k) cyclic code, where: – g(z) is a polynomial of degree n−k = r. – h(z) is i a polynomial l i l off d degree k. The other factor, the polynomial h(z), turns out to be a parity check polynomial p y , playing p y g the same role as the p parity y check matrix. If we have a valid code word, x(z), the product of the code word with h(z) is zero (modulo zn+1), and if we have a code word contained an error, error (y(z)h(z))mod(zn+1) = [(x(z)+e(z))h(z)]mod(zn+1) )]mod((zn+1)) = [e(z)h(z)] which is called the syndrome polynomial, s(z). 259

Encoding/Decoding • Encoding: x(z)=u(z)g(z); • Parity check polynomial: h(z) =[zn+1] /g(z) • Decoding: (y(z) is the received vector) – Calculate the syndrome polynomial:

s(z)=(y(z)h(z))mod(z s(z) (y(z)h(z))mod(zn+1) 1) – Look up the syndrome table to get e(z) from s(z); – x(z)=y(z)+e(z); – u(z)=x(z)/g(z);

• Thi This iis th the en/decoding /d di procedure d ffor non-systematic t ti codes. A modified version works for systematic codes (non-examinable) (non-examinable). 260

Hardware Implementation •

Binary polynomial multiplication: multiply-by-g(z) a(z) g0

•

g2

g1

g3

b(z)

g(z)= ) g0+g1z+g2z2+g3z3, g0=g3=1; 1 Binary polynomial division: divide-by-g(z)

g0

g1

g2 b(z)

a(z) g(z)=g0+g1z+g2z2+g3z3, g0=g3=1;

261

Examples of Cyclic Codes • • • •

Hamming code (used in computer memory) Cyclic redundancy check (CRC) code (used in Ethernet, ARQ etc.) B Bose-Chaudhuri-Hocquenghem Ch dh i H h (BCH) code d Reed-Solomon (RS) code (widely used in CD, DVD, hard disks, wireless, satellites etc.))

SNR per bit (dB)

262

Applications of Coding • The first success was the application of convolutional p space p p probes 1960’s-70’s. codes in deep – Mariner Mars, Viking, Pioneer missions by NASA

• Voyager, Galileo missions were further enhanced by concatenated codes (RS + convolutional). • The next chapter was trellis coded modulation (TCM) for voice-band modems in 1980’s. • 1990’s saw turbo codes approached capacity limit (now used d iin 3G) 3G). • Followed by another breakthrough – space-time codes in 2000’s 2000 s (used in WiMax WiMax, 4G) • The current frontier is network coding which may widen the bottleneck of Internet Internet. 263

264

265

266

T Trigonom metic Iden ntities


Revision Lecture Dr. Cong Ling Department of Electrical and Electronic Engineering

267

Lectures Introduction and background 1. Introduction 2. Probability and random processes 3 Noise 3. Effects of noise on analog communications 4. Noise performance of DSB 5. Noise performance of SSB and AM 6. Noise performance of FM p for FM and 7. Pre/de-emphasis comparison of analog systems

Digital communications 8. Digital representation of signals 9. Baseband digital transmission 10. Digital modulation 11 Noncoherent 11. N h td demodulation d l ti Information theory 12 Entropy and source coding 12. 13. Channel capacity 14. Block codes 15. Cyclic codes

268

The Exam • The exam paper contains 3 questions. All questions are p y compulsory. • For example, the questions may look like – Question 1 (40 marks): Basic knowledge of communication systems, elements of information theory and/or coding, mostly bookwork – Question 2 (30 marks): Analog or digital communications communications, bookwork, new example/application/theory – Question 3 (30 marks): Digital or analog communications or i f information i theory/coding, h / di b bookwork, k k new example/application/theory

• Sample questions: – Past papers – Problems in classes 269

Introduction, Probability and Random Processes • Primary resources in communications: power, bandwidth, cost • Objectives of system design: reliability and efficiency • Performance measures: SNR or bit error probability • Probability distribution: Uniform distribution, Gaussian distribution, Rayleigh y g distribution, Ricean distribution • Random process: stationary random process, autocorrelation and power spectral density, Wiener-Khinchine relation 

S X ( f )   RX ( )e  j 2f d 

270

Noise • Why is noise important in communications? How does performance? What types yp of noise exist? noise affect the p • White noise: PSD is constant over an infinite bandwidth. • Gaussian noise: PDF is Gaussian. • Additive white Gaussian noise • Bandlimited noise, bandpass representation, baseband noise nc(t) and ns(t), power spectral density n(t )  nc (t ) cos((2f ct )  ns (t ) sin( i ( 2f ct )  S ( f  f c )  S N ( f  f c ), | f | B Sc ( f )  S s ( f )   N otherwise 0,

271

Noise performance of AM • Signal-to-noise ratio

SNR 

PS PN

• Baseband communication model

SNRbaseband 

PT N 0W

• AM, DSB-SC, SSB, synchronous detection, envelope detection • Output SNR

272

Noise performance of FM • FM modulator and demodulator • Method to deal with noise in FM: linear argument at high SNR • Derivation of the output SNR, threshold effect

• Pre-emphasis and de-emphasis, how they increase the output SNR 273

Digital communications • PCM: sample, quantize, and encode  3P  • Quantization noise and SNR SNRo (dB)  6n  10 log10  2  (dB)  mp 

• Companding (A/-law) and line coding • Baseband data transmission, effects of noise, and probability of error

274

Noise performance of bandpass digital communications • Modulation formats: ASK, FSK, PSK • Coherent detection and its error probability • Noncoherent detection and its error probability (including differential detection for DPSK) • Q-function Q function Q(x): computation by using the graph or approximation

1  x2 / 2 Q( x)  e ,x0 2 x 275

Performance of digital modulation Scheme

Bit Error Rate

Coherent ASK

Q(A/2)

Coherent FSK

Q(A/2)

Coherent PSK

Q(A/)

Noncoherent ASK

½ exp( exp(-A A2/82)

Noncoherent FSK

p( 2/42) ½ exp(-A

DPSK

½ exp(-A2/22)

Noncoherent ASK, FSK Coherent ASK, FSK

Coherent PSK

Caution: ASK and FSK have the same bit error rate t if measured d by b average SNR SNR. Average SNR, dB 276

Information theory • The entropy of a discrete memoryless information source K

H ( S )    pk log2 pk k 1

• Entropy function (entropy of a binary memoryless source) H ( S )  (1  p ) log 2 (1  p )  p log 2 p  H ( p )

• Source coding theorem: The minimum average codeword length for any source coding scheme is H(S) for a discrete memoryless source. • Huffman coding: An efficient source coding algorithm.

277

Channel coding theorem • If the transmission rate R ≤ C, then there exists a coding g scheme such that the output p of the source can be transmitted over a noisy channel with an arbitrarily small probability of error. Conversely, it is not possible to transmit messages without error if R > C. • Shannon formula  P  C  B log 2 1  SNR   B log 2 1   N B 0  

278

Channel coding • Block vs. convolutional codes • Binaryy fields and vector space, p , Hamming g distance/weight g • A linear block code can be defined by a generator matrix G and the associated parity-check matrix H. • Every linear block code is equivalent to a systematic code. • The key parameter of a linear block code is the minimum Hamming distance dmin: – Up to dmin – 1 errors can be detected; – Up p to (d ( min – 1)/2 ) errors can be corrected.

• Syndrome decoding: compute the syndrome and then find the error pattern. • Hamming codes • Cyclic codes: (polynomial representation is not examinable) 279

Appendix: More Background on Probability

Probability

•

Sample Space – S : Set of all random experiment outcomes – S = { s : s iis an outcome t }

•

Examples – For tossing a coin, S = { H, T } – Roll a die, S = { 1, 2, …, 6 }

•

Event E  S – Roll a die twice: S = { (H,H), (H,T), (T,H), (T,T) } – Event E = { (H,H), (T,T) } – A collection of events, F. Obviously, F  S

•

A probability measure on F is a function P : F → [0,1] [0 1] satisfying the probability axioms: – P(S) = 1 – P(F) ≥ 0 – For events A, B belonging to F, if A ∩ B=0, P(A U B) = P(A) + P(B)

281

Random Variables

•

A random variable X(s) is a variable whose value depends on the outcome s of a random experiment with the defined probabilityy measure.

•

For convenience, we simply denote the random variable by X.

•

Consider a gamble in Macau (an example of discrete random variables):

Loss Draw Win P=3/8 P=1/4 P=3/8

-$5 $5

•

0

$5

 5  X (s)   0 5 

s W sD

S W  D L

sL

P ( X ( s )  5)  P ( s : s  W )  P (W ) 

3 8

P ( X ( s )  0)  P ( s : s  D )  P ( D ) 

1 4

X(s) P ( X ( s )   5)  P ( s : s  L )  P ( L ) 

3 8

Continuous random variables: take values that vary continuously, e.g., water flow of River Thames through London Bridge 282

CDF and pdf

• •

Cumulative Distribution Function (CDF), also known as Probability Distribution Function Probability Density Function (pdf)

CDF : F X ( x )  P ( X  x ) pdf : f X ( x ) 

dF X ( x ) dx

x

FX ( x ) 



f X ( y ) dyy

 

FX (  ) 



f X ( y ) dyy  1

 b

P ( a  X  b )  FX ( b )  FX ( a ) 



f X ( y ) dy

a

Since F X ( x ) is non - decreasing f X ( x) 

dF X ( x ) dx

0 283

Interpretation of pdf

•

If  x is sufficiently small,

P ( x  X  x  x ) 

f X ( y)

x  x

 f X ( y ) dyy  f X ( x )  x

x

Area

f X ( x)

x

x •

Expectation operator (expected value or average):

E[ X ]   X  •

y

Variance:





y f X ( y ) dy





 X2  E[( X   X )2 ]   ( y   X )2 f X ( y)dy  E[ X 2 ]   X2 

284

Moments of a Random Variable

•

Moments:  r  E [ X r ] Average (mean) of X:

•

for r  1 ,2 ,3 ,...

 1  E [ X ] or m X

Central moments,, centered on the mean

 r '  E [| X  m X |r ] •

for r  1 ,2 ,3 ,...

C Comments

1 '  E[ X  m X ]  E[ X ]  m X  0 Variance : 2 '  E[| X  m X |2 ] Standard deviation :  X  2 ' which gives a measure of “dispersion” of X about its mean

285

Examples of Discrete Distributions

•

Let p + q = 1

•

Bernoulli distribution P ( X  1)  p

P ( X  0)  q

with p representing the probability of success in an experiment •

Bi Binomial i l distribution di t ib ti n P (Y  k )    p k q n  k k 

k  0 ,1, 2 ,..., n

Y represents the total number of successes, in an independent p trial of n Bernoulli experiments •

Exercise: Verify

E [ X ]  p ,  X2  pq E [Y ]  np ,  Y2  npq 286

Examples of Continuous Distributions: Exponential Distribution

f X ( x)

 x

0

f X ( x )  e x

•

E Exercise: i V if Verify

for x  0

E(X ) 

 X2 

1



1

2

287

Normal (Gaussian) Distribution

f X ( x)

 m

0

f X ( x)  FX ( x )  •

x

1 2  1 2 

e

( x  m )2  2 2

x

e

for    x  

( y  m )2  2 2

dy



Exercise: Verify E ( X )  m



X

2



2 288

Rayleigh and Rice Distributions

•

Define a random variable R 

X

2

Y

2

where X and Y are

independent Gaussian with zero mean and variance  2 •

R has Rayleigh distribution:

f R (r )  •

r

2

e

 r 2 /( 2 2 )

r0

If X has nonzero mean A, R has Rice distribution:

f R (r )  where

I0 ( x) 

1 2

e 2 r



2

0

 ( r 2  A2 ) /( 2 2 )

e x cos d

I 0 ( Ar2 ) r  0

is the modified zero-order

Bessel function of the first kind

289

Conditional Probability

•

Consider two events A and B, not necessarily independent of each other – Define P(A|B) = Probability of A given B

•

Carry out N independent trials (experiments) and count – N(A) = Number of trials with A outcome – N(A,B) = Number of trials with both A and B outcome

•

By definition N ( A) as N N ( A,B ) B )  N (B)

P ( A)  P(A |

P(A  B)  Similarly, •

N ( A,B ) N

N  



N ( A,B ) N ( A) N ( A) N

 P ( B | A)P ( A)

P( A  B )  P( A | B )P(B )

Statistical independence between A and B

P( A | B)  P( A)  P( A  B)  P( A)P(B) 290

Joint Random Variables

Random variables : y

x

FXY ( x , y ) 

X ,Y

f

XY

( u , v ) dudv

  

f XY ( x , y ) : the joint pdf FXY ( x , y ) : the joint CDF •

Properties of joint distribution:  

1) FXY ( ,  )    f XY (u, v )dudv  1 2a ) 2b )

f X ( x)  fY ( x ) 

  

 f XY ( x, y )dy

y   

 f XY ( x, y )dx

x  

 2 FXY ( x , y ) xy

3)

f XY ( x, y ) 

4)

X , Y are independen t  f XY ( x, y )  f X ( x ) fY ( y ) 291

Conditional CDF and pdf

Define FY ( y | x ) as the conditional CDF for Y given X = x. y x  x By conditional probability probability, f XY ( u , v ) dudv   FY ( y | x  X  x   x )  P ( YP (xy, xXXx xx ) x )    xx x  f X ( u ) du x

y



 f XY ( x ,v )  xdv



f X ( x ) x

y

As  x  0,

FY ( y | x ) 

 f XY ( x ,v ) dv



fX (x)

The conditional pdf

fY ( y | x ) 

dFY ( y | x ) dy

 fY ( y | x ) 

f XY ( x , y ) fX (x)

292

Joint Distribution Function of Several Random Variables

•

The joint PDF of n random variables

X 1 ,..., X n FX 1 X 2 ... X n ( x1 , x2 ,... xn )  P ( X 1  x1 , X 2  x2 ,... X n  xn ) •

The joint pdf

f X1 X 2 ... X n ( x1 , x2 ,...xn )  •

 n FX1X 2 ... X n ( x1 , x2 ,... xn ) x1x2 ...xn

Independent random variables

FX1X2 ...Xn (x1, x2 ,...xn )  FX1 (x1)FX2 (x2 )...FXn (xn ) f X1X2 ...Xn (x1, x2 ,,...xn )  f X1 (x1) f X2 (x2 ))...f Xn (xn ) •

Uncorrelated random variables

E[ X i X j ]  E [ X i ]E [ X j ]  i, j , i  j 293

Covariance and Correlation Coefficient

Covariance of X and Y:

cov( X , Y )  E [( X  m X )( Y  mY )] Correlation Coefficient: X ,Y )  XY  cov(    X

Property 1:  1 

Y

E [( X  m X )(Y  mY )]

 XY

 XY  1

Property 2: X and Y are linearly related ( i .e., Y  aX  b ) if and only if  XY   1 Property 3: If X and Y are independent,  XY  0 Caution: The converse of Property 3 is not true in general. That is, if  XY  0 , X and Y are not necessarily independent! 294

Joint Gaussian distribution

X, Y jointly normally distribute d with m X  mY  0,  X   Y   f XY ( x , y ) 

2 2

1 2 1  XY



exp 

1 2 2 (1  XY ) 2

( x 2  2  XY xy  y 2 )



The marginal density : 

fY ( y ) 

f

XY

( x , y ) dx 

x  

1 2 

exp{ 

Conditiona l density function : f X |Y ( x | y ) 

f XY ( x , y ) fY ( y )



1 2 2  1  XY



y2 2 2

exp 

}

1 2 2 (1  XY ) 2

( x   XY y ) 2



Note that this conditional pdf is also a normal density function with 2 2 mean  XY y and variance  (1   XY ) . The effect of the condition (having Y=y on X) is to change the mean of X to  XY y and to 2 reduce d th the variance i b by  2  XY . 295

Independent vs. Uncorrelated • Independent implies Uncorrelated • Uncorrelated does not imply Independence • For jointly Gaussian random variables, Uncorrelated implies Independent (this is the only exceptional case!) • Exercise: verify the above claim of jointly Gaussian random variables

296

An example of uncorrelated but dependent random variables

Let  be uniformly distributed in [0,2 ]

f ( x)  21 for 0  x  2 Define random variables X and Y as

X  cos Y  sin

Clearly, X and Y are not independent. In particular, for a given  , X and Y are dependent. If X and Y were independent, we should see possible sample points of (X,Y) assume all possible values of X Y Locus of and Y in a unit square. q X and Y But X and Y are uncorrelated as   XY  E[( X  m X )(Y  mY )] X

 E [ XY ] 2

 21 0 cos sin d  0! 297

Central Limit Theorem

•

n

Define S n   X i i 1

where X i ' s are i.i.d. with •

E[ X ]  ,

Define a new random variable,, R n 

 X2   2   S n  n n

Central Limit Theorem : x

lim P ( R n  x )  n 

•

1 2

e

 y2 / 2

dy



Importance: The “shifted and scaled” sum of a very large number of i.i.d. random variables has a Gaussian distribution

298