Distributed Source Coding in Wireless Sensor Networks

JOHANNES KARLSSON

Master’s Degree Project Stockholm, Sweden 2006-06-30

XR-EE-KT 2006:006

Abstract

The advent of wireless sensor networks adds new requirements to the design of communication schemes. Not only should the sensor nodes be energy efficient, they should preferably also be small and cheap. In light of this, a joint source-channel coding design that utilizes a basic method of distributed source coding is investigated. An iterative algorithm for designing such a system for two correlated Gaussian sources is proposed and evaluated. It is shown that the proposed system finds a way of using the correlation of the sources both to reduce the quantization distortion and to introduce protection against channel errors.


Acknowledgments

The work presented in this thesis was conducted during the spring of 2006 in the Communication Theory group at the Department of Signals, Sensors and Systems (S3) at the Royal Institute of Technology (KTH). I would first of all like to thank my supervisor Niklas Wernersson for all of his help throughout the work with this thesis. Also big thanks to my examiner Prof. Mikael Skoglund for letting me do my thesis in the Communication Theory group. Finally I would like to thank my family and all my friends who have been supporting me through my years of studies, and also thanks to my Heavenly Father for all his blessings!


Contents

1 Introduction
  1.1 Quantization
  1.2 Source Optimized Scalar Quantization
  1.3 Channel Optimized Scalar Quantization
    1.3.1 Optimal Quantizer Design for Noisy Channels
  1.4 Distributed Source Coding

2 Channel Optimized Scalar Quantization of Correlated Sources
  2.1 Problem Formulation
  2.2 Analysis
    2.2.1 Finding the Optimal Encoder q1
    2.2.2 Finding the Optimal Decoder g1
    2.2.3 Algorithm
  2.3 Numerical Results
    2.3.1 Channel Mismatch
    2.3.2 Correlation-SNR Mismatch

3 Reducing Complexity
  3.1 Multistage Quantization
    3.1.1 Analysis
    3.1.2 Numerical Results
  3.2 Dual COSQ with Optimal Decoders
    3.2.1 Analysis
    3.2.2 Numerical Results

4 Conclusions
  4.1 Comments on the Results
  4.2 Future Work

A Derivations
  A.1 pdf of a Gaussian Variable
  A.2 pdf of the Sum of Two Gaussian Variables
  A.3 Conditional pdf of a Gaussian Variable
  A.4 P(i2 | x1)
  A.5 Reconstruction Points

B Numerical Results

Bibliography

Chapter 1

Introduction

Wireless Sensor Networks (WSNs) are emerging and are expected to play an important role in tomorrow's sensor technology. A WSN is a network of several sensor nodes that may communicate with each other and/or with a fusion center where the information from the sensors is processed. The possible uses of WSNs are many, for example: environmental control, monitoring of industrial processes, intelligent homes, etc.

Some requirements on the sensor nodes are that they should be energy efficient, small and cheap. The requirement to be energy efficient is especially important because the sensors usually run on batteries and it would be good if the battery lasted for the entire lifetime of the sensor. The energy constraint further means that the sensor must not use too much energy for communication or data processing. Reducing the energy spent on communication leads to a system that is more sensitive to disturbances from, for example, other wireless devices. If the sensors are used in a control system, or any other real-time application requiring short delays, the sensors must transmit their readings immediately. This means that the communication scheme should be designed for short blocks of data.

An example scenario is a WSN consisting of several temperature sensing nodes placed with a high spatial density all over a building. This would make it possible for the environmental control system to be more precise, since the temperature is known not only in one or two points in the building but in a dense set of points in the entire building. Of course, two sensor nodes that are close to each other are expected to have a high correlation between their readings. With all of the above in mind, the objective of this thesis is to investigate how one can find a simple coding scheme for transmitting measurements from sensors with correlated readings over a channel with disturbances.

Figure 1.1: A simple system for transmitting an analog value over a digital channel.

The outline of the thesis is as follows: Chapter 1 gives an introduction to the main ideas and concepts that are needed throughout the rest of the thesis. Chapter 2 presents an algorithm for designing a system that can be used to transmit readings from two sensors with correlated readings. Chapter 3 investigates some modifications of the algorithm from the previous chapter to reduce complexity. Chapter 4 presents some of the conclusions that have been made and also some ideas for future work.

1.1 Quantization

A simple communication system can be seen in Figure 1.1, where the aim is to transmit an analog value X ∈ ℝ over a digital channel. The analog value needs to be translated into bits in order to be transmitted over the digital channel. The problem is that since an analog value has an infinite number of decimals it would require an infinite number of bits to be fully represented. The solution is to divide all values of X into a finite number of quantization regions. For each region a unique combination of bits, called a codeword, is assigned. Instead of transmitting the actual value of X, the encoder q finds which region the value belongs to and sends the codeword i associated with this region. On the receiving side, the decoder g makes a reconstruction X̂ of X based on the received codeword. The procedure just described is called quantization and is a necessity in all digital systems that deal with analog signals, such as CD players and cell phones.

All values of X that belong to the same region will be reconstructed to the same value, and because of this X and X̂ will in general not be equal. The error introduced to X̂ is called distortion. The goal is of course to minimize the distortion by making X̂ as similar to X as possible. To be able to do so we need a measure of the distortion. The most widely used measure is the mean square error (MSE), defined as

$$D(X, \hat{X}) = E[(X - \hat{X})^2] \qquad (1.1)$$

Figure 1.2: Example of a 2 bit uniform quantizer.

where E[·] is the expected value operator. One important property of this measure is that D(X, X̂) ≥ 0 with equality if and only if X = X̂. The amount of distortion varies with the number of bits and with the quantizer itself. The variation with the number of bits can be understood from the fact that b bits give 2^b different codewords. If each codeword is associated with one region, increasing b by one will give twice as many regions, which makes it possible for the size of each region to be half of the previous size. The reconstruction X̂ associated with each region will therefore better approximate all of the values that are encoded to that region.

The simplest form of quantization is to use a fixed quantization step δ and divide a closed interval on the real line into disjoint regions of length δ. Each region is assigned a unique codeword from which an approximation of X can be made. By using a fixed quantization step you get a uniform quantizer, which is the optimal scalar quantizer for a uniformly distributed source defined on this interval. Another thing that makes the uniform quantizer simple is that for a uniformly distributed source the best reconstruction points in the MSE sense are the centers of the corresponding regions. An example of a 2 bit uniform quantizer for a uniform source distribution between −1 and 1 can be found in Figure 1.2.
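As a concrete illustration, here is a minimal Python sketch of the 2 bit uniform quantizer above; the interval [−1, 1] is taken from the example, and the midpoint reconstruction is the MSE-optimal choice for a uniform source.

```python
import numpy as np

def uniform_quantize(x, b=2, lo=-1.0, hi=1.0):
    """Encode x into one of 2^b uniform regions on [lo, hi] and
    reconstruct with the region midpoints."""
    n = 2 ** b
    delta = (hi - lo) / n                                  # fixed step
    i = np.clip(((x - lo) / delta).astype(int), 0, n - 1)  # codeword index
    x_hat = lo + (i + 0.5) * delta                         # midpoints
    return i, x_hat

x = np.random.uniform(-1, 1, 100_000)
_, x_hat = uniform_quantize(x)
print(np.mean((x - x_hat) ** 2))   # close to delta^2/12 = 0.5^2/12
```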

1.2 Source Optimized Scalar Quantization

For sources that are not uniformly distributed, among others, Lloyd [1, 2] and Max [3] have independently developed similar algorithms for designing a scalar quantizer that minimizes the MSE. This quantizer is commonly known as the Lloyd-Max quantizer. The algorithm consists of the following steps that iteratively improve the encoder/decoder pair:

1. Choose a set of initial reconstruction points for the decoder.

2. Find the encoder that minimizes the MSE given the decoder.

3. Find the decoder that minimizes the MSE given the encoder.

4. Evaluate the MSE; if the decrease compared to the last iteration is less than some threshold, stop the iteration, otherwise go to Step 2.

The reason for doing the search in this way is that it is hard to optimize the encoder at the same time as the decoder, but simple to optimize the encoder for a fixed decoder and vice versa. For a non-uniformly distributed source this update procedure will result in a quantizer with quantization regions of different lengths. An example of this can be seen in Figure 1.3, where a 3 bit quantizer for a zero-mean Gaussian source with unit variance is shown. Since the system is optimized for the source it is called Source Optimized Scalar Quantization (SOSQ). This method can be extended to the more general case of vector quantization, where the quantizer operates on blocks of data, and is then referred to as the generalized Lloyd algorithm or k-means algorithm [4].

Figure 1.3: 3 bit quantizer optimized for a zero-mean Gaussian source with unit variance.

The problem with the above algorithms is that they will converge to a local optimum which may not be the global optimum. The quality of this local optimum depends on the initial choice of reconstruction points for the decoder. One way to obtain solutions that are close to the global optimum is to use the LBG algorithm proposed by Linde, Buzo and Gray [5]. To find an N bit quantizer the LBG algorithm uses the solution of an N − 1 bit quantizer as initial reconstruction points. This is done by splitting every reconstruction point from the N − 1 bit quantizer into two different points separated by some small distance ε. In this way the LBG algorithm gradually finds a solution for an increasing number of bits, starting with the simple case of N = 0 bits, where the reconstruction point is the mean of the source samples.
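A minimal sketch of the Lloyd-Max iteration, driven here by training samples rather than the analytical pdf (a simplification made in this sketch, not in the thesis):

```python
import numpy as np

def lloyd_max(samples, b=3, tol=1e-6, max_iter=200):
    """Alternate the two partial optimizations of the Lloyd-Max algorithm
    until the MSE decrease falls below tol."""
    n = 2 ** b
    # Step 1: initial reconstruction points spread over the sample range.
    r = np.linspace(samples.min(), samples.max(), n)
    prev_mse = np.inf
    for _ in range(max_iter):
        # Step 2: best encoder for a fixed decoder = nearest point.
        idx = np.argmin(np.abs(samples[:, None] - r[None, :]), axis=1)
        # Step 3: best decoder for a fixed encoder = region centroids.
        for k in range(n):
            if np.any(idx == k):
                r[k] = samples[idx == k].mean()
        # Step 4: stop when the MSE decrease is below the threshold.
        mse = np.mean((samples - r[idx]) ** 2)
        if prev_mse - mse < tol:
            break
        prev_mse = mse
    return np.sort(r)

print(lloyd_max(np.random.randn(200_000)))  # 3 bit quantizer, unit Gaussian
```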

1.3 Channel Optimized Scalar Quantization

In the previous sections we have assumed that the channel was error free, that is, if we sent something we were sure that the receiver would receive the exact same thing. For a real channel there is a probability ε > 0 that the received value is different from the transmitted value. The channel model that will be used throughout this thesis is the Binary Symmetric Channel (BSC). The BSC is a discrete channel where there is a probability ε that the bit you send will change value, from 0 to 1 or from 1 to 0. The probability that you receive the same value as you sent is thus 1 − ε. This is illustrated in Figure 1.4.

Figure 1.4: The Binary Symmetric Channel.

All bits are corrupted independently of each other and it is therefore straightforward to calculate the transition probabilities for the codewords, denoted P(j | i), which is the probability that j is received given that i was sent. For example, if ε = 0.1, given that 000 is sent the probability of receiving 000 is 0.9³ = 0.729 and the probability of receiving 011 is 0.9 · 0.1² = 0.009.

As soon as there are errors on the channel the SOSQ is no longer optimal, since it is optimized for an error-free channel. There are two ways to deal with channel errors. One is to keep the SOSQ and add a separate channel encoder/decoder pair with a strong error correcting code around the channel, and in this way create a channel that is close to error free, see Figure 1.5. The channel encoder adds a controlled amount of redundancy to the data that is transmitted. The redundancy is later used by the channel decoder to detect transmission errors and correct them. The channel will however still have an error probability P_ERR > 0 that can be made arbitrarily close to zero at the cost of adding more redundancy. The added redundancy means that, for a fixed bit rate, fewer bits can be used by the SOSQ.

Figure 1.5: A basic system with a channel encoder/decoder pair creating a channel that is close to error free.

Figure 1.6: A system where joint source-channel coding is used.

The channel encoder/decoder must further operate on long blocks of data to be efficient. Another approach is to take the channel into consideration when the quantizer is being designed, as in Figure 1.6. The system is then called Channel Optimized Scalar Quantization (COSQ). There are pros and cons with both ways of designing the system. The advantage of using a separate channel encoder/decoder pair is that it is easy to adapt the system to different source data or to different channels. In theory there should be no loss in performance in doing the coding separately; however, this is under the assumption that the encoders operate on blocks of data of infinite length [6]. In a real-time system where you want short delays this is obviously not possible. The joint source-channel coding design, where the encoding is done in one step, is expected to give better performance when operating on short blocks of data compared to the separate design. The drawback of the joint design is that the complete system must be remade if either the source or the channel is changed in some way. Due to the expected performance gain when operating on short blocks of data, the joint source-channel coding design will be studied in this thesis.
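Before moving on, note that the codeword transition probabilities reduce to a one-liner: since the bits are corrupted independently, P(j | i) = ε^d (1 − ε)^(b−d), where d is the Hamming distance between the b bit codewords i and j. A small sketch that reproduces the numbers above:

```python
import numpy as np

def bsc_transitions(b, eps):
    """Matrix P with P[i, j] = P(j received | i sent) over a BSC with
    crossover probability eps and b-bit codewords."""
    n = 2 ** b
    d = np.array([[bin(i ^ j).count("1") for j in range(n)]
                  for i in range(n)])      # Hamming distances
    return eps ** d * (1 - eps) ** (b - d)

P = bsc_transitions(3, 0.1)
print(P[0b000, 0b000])   # 0.9^3       = 0.729
print(P[0b000, 0b011])   # 0.9 * 0.1^2 = 0.009
print(P.sum(axis=1))     # every row sums to 1
```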

1.3.1 Optimal Quantizer Design for Noisy Channels

This section is a brief review of the work done by Farvardin and Vaishampayan [7] on this topic. The system studied is the same as in Figure 1.6, with the assumption that X can be modeled as a discrete-time zero-mean memoryless stationary process with probability density function (pdf) f_X(x) and variance σ_X² = E[X²] < ∞. The idea behind the proposed algorithm is the same as in the Lloyd-Max algorithm, with the extension of taking the channel into consideration. That is, given the decoder g find the best encoder q, and given the encoder find the best decoder in MSE sense. The encoder is created by dividing the support of f_X(x) into disjoint regions A_i, i ∈ {1, 2, ..., 2^b}, where A_i is defined as the collection of all values of x that should be encoded to the codeword with index i and b is the number of bits used by the encoder. To minimize the MSE, A_i must satisfy

$$A_i = \{x : E[(x - \hat{X})^2 \mid q(x) = i] \le E[(x - \hat{X})^2 \mid q(x) = k], \; \forall k \ne i\}. \qquad (1.2)$$

This can equivalently be expressed as

$$A_i = \bigcap_{\substack{k=1 \\ k \ne i}}^{2^b} A_{ik} \qquad (1.3)$$

where A_ik is defined as

$$A_{ik} = \{x : E[(x - \hat{X})^2 \mid q(x) = i] \le E[(x - \hat{X})^2 \mid q(x) = k]\}. \qquad (1.4)$$

The set A_ik specifies all points x that will give a lower or equal MSE if they are mapped to codeword i instead of codeword k. By expanding the squares and rearranging the terms in (1.4), it can be seen that the inequality is between two functions that are linear in x. In [7] it is further shown that because of this A_i must be an interval, and an analytical expression for finding the endpoints of the interval for a given decoder is derived. It is a well-known fact from estimation theory that in order to minimize the MSE the decoder is calculated as the expected value of X given that a certain codeword j is received:

$$\hat{x}(j) = E[X \mid j]. \qquad (1.5)$$

In the same way as for the Lloyd-Max algorithm in Section 1.2, an encoder/decoder pair is found by iteratively improving the encoder and decoder until they have converged to a local optimum. As in the case of the Lloyd-Max algorithm the final solution will depend on the initial reconstruction points, but since we are now dealing with a channel where there will be errors, the initial codeword assignment is also important. The initial codeword assignment is the way the reconstruction points are associated with different codewords in the initial decoder. The reason why this is important is that sometimes when the codeword 000 is transmitted it will be received as 001, 010 or 100. To be robust against channel errors the codewords should be assigned so that bit changes in a single position do not result in a big numerical change in the reconstructed value. Three different codeword assignments can be found in Figure 1.7. The Natural Binary Code (NBC) and the Folded Binary Code (FBC) are both quite good codes. If one has to choose between them, it turns out that in general the NBC is the best choice for a uniformly distributed source and the FBC is the best choice for a Gaussian distributed source [7]. The Bad Code in the figure is an example of a code that should not be used: all bit changes in a single position will result in a reconstruction where even the sign is wrong.

Figure 1.7: A 3 bit quantizer with different codeword assignments.

To be able to compare the performance of different systems we use the Signal-to-Distortion Ratio (SDR), defined as

$$\mathrm{SDR} = 10 \log_{10}\!\left(\frac{E[X^2]}{E[(X - \hat{X})^2]}\right). \qquad (1.6)$$

It can easily be seen that this is nothing but the ratio between the variance of the source and the MSE of the reconstructed signal. SDR = 0 dB means that the MSE is as large as the signal itself and SDR = ∞ dB means that the signal is perfectly reconstructed. An example of how channel errors affect different codeword assignments can be seen in Figure 1.8, where the Bad Code from the previous example is compared to the FBC. 1000000 samples of a zero-mean Gaussian source have been encoded with a SOSQ and sent over a channel with crossover probability ε = 0.1. The SDR is 2.2 dB for the FBC, 1.6 dB for the NBC and −0.9 dB for the Bad Code. In other words, in this case the Bad Code is so bad that it would have been better to neglect the received value and decode everything to 0, since this would have given an SDR of 0 dB.

Something that is different compared to the Lloyd-Max quantizer is that, for high crossover probability ε, some of the sets A_i are empty, which means that the corresponding codewords will never be transmitted. For example, in the case when ε = 0.1 and 4 bits/sample are used, only 6 of the 16 available codewords are used by the encoder. When the receiver receives any of the 10 codewords that are never transmitted, X̂ is reconstructed as the expected value of X given the received codeword (as is always the case) according to (1.5). The codewords that are not used are such that they, by single bit errors, would be mistaken for the 6 codewords that are used. This shows how the joint design finds a tradeoff between quantization distortion and robustness against channel errors in order to minimize the MSE.
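The SDR experiment above can be reproduced approximately with a short Monte-Carlo sketch. The reconstruction points below are approximate 8-level Lloyd-Max values for a unit Gaussian, and the indices are transmitted in natural order, i.e. the NBC; both are assumptions of this sketch.

```python
import numpy as np

def sdr_db(x, x_hat):
    """Signal-to-Distortion Ratio, equation (1.6), in dB."""
    return 10 * np.log10(np.mean(x ** 2) / np.mean((x - x_hat) ** 2))

# Approximate 8-level Lloyd-Max reconstruction points for a unit Gaussian.
r = np.array([-2.152, -1.344, -0.756, -0.245, 0.245, 0.756, 1.344, 2.152])

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)
idx = np.argmin(np.abs(x[:, None] - r[None, :]), axis=1)  # SOSQ encoder

b, eps = 3, 0.1
flips = rng.random((x.size, b)) < eps               # independent bit errors
mask = (flips * (1 << np.arange(b))).sum(axis=1)    # XOR error pattern
print(sdr_db(x, r[idx ^ mask]))  # the text above reports 1.6 dB for the NBC
```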


Figure 1.8: These figures show 40 samples of a Gaussian source encoded with 3 bits and sent over a BSC with crossover probability ε = 0.1. The dashed line shows the source signal and the solid line shows the reconstructed signal at the receiver. In (a) the codeword assignment is according to the Bad Code and in (b) the codeword assignment is according to the FBC. Both transmissions are affected by errors at the exact same bit positions. It can be seen that the Bad Code is more sensitive to channel errors than the FBC.

In Figure 1.9 the performance of this quantizer is compared to the Lloyd-Max quantizer for different crossover probabilities ε. In some sense the comparison is not fair, since the channel optimized quantizer has been optimized for the specific ε in each point while the Lloyd-Max quantizer is designed without being aware that there will be channel errors. Nevertheless it shows that it is useless to put many bits into a source optimized quantizer if the signal is to be transmitted over a channel with errors.

Figure 1.9: Channel optimized scalar quantizer (solid) in comparison to the Lloyd-Max quantizer (dashed) for different crossover probabilities ε and different bit rates. The source that is being quantized is Gaussian distributed.

1.4 Distributed Source Coding

To explain the idea behind Distributed Source Coding (DSC) the concept of entropy is needed. The entropy of a discrete random variable X is denoted H(X) and can be seen as the minimum number of bits needed to encode X without any loss of information. However, this is a theoretical bound and to achieve it the encoder may have to operate on blocks of infinite length. Similarly, the joint entropy H(X, Y) of two discrete random variables X and Y can be seen as the minimum number of bits needed to encode X and Y jointly. If X contains any information about Y then H(X, Y) < H(X) + H(Y). The joint encoding of X and Y could for example be done by first encoding Y to H(Y) bits/sample and then X to H(X | Y) bits/sample, which is the entropy of X if Y is known; by definition H(X, Y) = H(Y) + H(X | Y). An example of how this could be used can be seen in Figure 1.10(a), where Y is known to both the sender and the receiver. Because of this it is enough to use H(X | Y) bits/sample to encode X instead of H(X) ≥ H(X | Y) bits/sample.

Figure 1.10: In (a) X is encoded to H(X | Y) bits/sample with knowledge of Y at the encoder and in (b) X is still encoded with H(X | Y) bits/sample but without any knowledge of Y at the encoder.

Nothing strange so far, but the question that arises is: what would happen if Y was not known to the sender but only to the receiver, as shown in Figure 1.10(b)? Slepian and Wolf [8] showed that if X and Y are two discrete random variables and Y is known at the receiver, then it is still possible to achieve a coding scheme where only H(X | Y) bits/sample are used for the encoding of X. This is called Distributed Source Coding.

The following example illustrates how this is possible. Let X and Y be two discrete random variables of 3 bits each. Let the correlation between them be such that the Hamming distance between X and Y is ≤ 1, meaning that they differ in at most one bit. If Y is known only to the decoder, then X can be compressed to 2 bits/sample by grouping all possible outcomes of X into the following sets: {000, 111}, {001, 110}, {010, 101} and {100, 011}. Instead of transmitting the value of X it is enough to transmit a 2 bit index telling which of the sets X belongs to. This works because the Hamming distance between the elements within each set is 3, while the Hamming distance between X and Y is ≤ 1, which means that the true value of X is the element closest to Y in the selected set.

Wyner and Ziv later extended the results by Slepian and Wolf to the case of lossy encoding of continuous-valued Gaussian variables [9]. However, neither Slepian and Wolf nor Wyner and Ziv showed how to achieve these results in practice. For more information about DSC and different approaches to implementing it the reader is referred to [10], from where the example is taken, and [11].
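The 3 bit example maps directly to code: the four sets are the cosets of the repetition code {000, 111}, and the decoder picks, within the signalled set, the element closest to Y in Hamming distance.

```python
def hamming(a, b):
    return bin(a ^ b).count("1")

# Cosets of {000, 111}: within each set the Hamming distance is 3.
cosets = [(0b000, 0b111), (0b001, 0b110), (0b010, 0b101), (0b100, 0b011)]

def dsc_encode(x):
    """2 bit index of the set containing x; Y is not needed here."""
    return next(k for k, c in enumerate(cosets) if x in c)

def dsc_decode(k, y):
    """Pick the element of set k closest to the side information y."""
    return min(cosets[k], key=lambda x: hamming(x, y))

# Every (X, Y) pair with Hamming distance <= 1 is recovered exactly.
for x in range(8):
    for y in range(8):
        if hamming(x, y) <= 1:
            assert dsc_decode(dsc_encode(x), y) == x
print("all pairs with d(X, Y) <= 1 decoded correctly")
```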


Chapter 2

Channel Optimized Scalar Quantization of Correlated Sources

2.1 Problem Formulation

Having seen the results from the joint source-channel code design in Section 1.3 and some theoretical results on DSC in Section 1.4, the question one might ask is whether it is possible to combine them. In this thesis that is exactly what we will try to do; the approach is to take the algorithm from the COSQ and extend it to take the correlation of two random variables into account. The system that will be studied can be seen in Figure 2.1. Two Gaussian distributed random variables X1 and X2 are to be encoded and transmitted to a fusion center. The encoding is to be done in a distributed manner by the two encoders q1 and q2, the codewords i1 and i2 from the encoders are sent over a BSC with crossover probability ε, and finally, at the fusion center, a pair of decoders, g1 and g2, should reconstruct X1 and X2 based on the received codewords j1 and j2. X1 and X2 are correlated according to

$$X_1 = Y + N_1 \qquad (2.1)$$
$$X_2 = Y + N_2 \qquad (2.2)$$

where Y, N1 and N2 are three independent zero-mean Gaussian distributed random variables with variances σ_Y², σ_N1² and σ_N2² respectively. In this thesis we will furthermore make the simplification of letting σ_N1² = σ_N2² = σ_N². As a measure of the correlation between X1 and X2, the correlation-SNR is defined as

$$C_{SNR} = 10 \log_{10}\!\left(\frac{\sigma_Y^2}{\sigma_N^2}\right). \qquad (2.3)$$

Figure 2.1: System with two signals X1 and X2 that are encoded separately but decoded jointly at a fusion center.

C_SNR = −∞ dB means that X1 and X2 are uncorrelated and C_SNR = ∞ dB means that they are fully correlated. Given a system with these specifications, the objective is to find a pair of b bit encoders, q1 and q2, and the corresponding joint decoders, g1 and g2, that minimize the distortion D, defined as the sum of the MSE for X1 and X2:

$$D = D_1 + D_2 = E[(X_1 - \hat{X}_1)^2] + E[(X_2 - \hat{X}_2)^2]. \qquad (2.4)$$

2.2 Analysis

Since the two channels are identical we will do the analysis for the first channel only, but it is obvious that the same results hold for the second channel. To simplify the expressions a bit, N = 2^b is defined as the number of codewords. The encoder q1 is a mapping from x1 ∈ ℝ to a b bit codeword with index i1 ∈ {1, 2, ..., N}. To define this mapping we let A¹_i1 be the set containing all points x1 that should be mapped to codeword i1, where the superscript 1 in A¹_i1 denotes the channel. The mapping can then be expressed as

$$x_1 \in A^1_{i_1} \Rightarrow q_1(x_1) = i_1 \qquad \forall i_1 \in \{1, 2, \ldots, N\}. \qquad (2.5)$$

In a similar way the decoder g1 is a mapping from a pair of received codewords (j1, j2) to a real-valued point X̂1 ∈ R¹, where R¹ = {r¹_11, r¹_12, ..., r¹_NN} is the set of reconstruction points for X1. The decoder can now be defined as

$$\hat{x}_1 = g_1(j_1, j_2) = r^1_{j_1 j_2} \qquad \forall (j_1, j_2) \in \{1, 2, \ldots, N\} \times \{1, 2, \ldots, N\}. \qquad (2.6)$$

With these definitions the distortion D1 can be written as

$$D_1 = \int_{x_1} f_{X_1}(x_1) \sum_{j_1=1}^{N} P(j_1 \mid q_1(x_1)) \sum_{i_2=1}^{N} P(i_2 \mid x_1) \sum_{j_2=1}^{N} P(j_2 \mid i_2) \, (x_1 - g_1(j_1, j_2))^2 \, dx_1 \qquad (2.7)$$

and D2, from the perspective of channel 1, becomes

$$D_2 = \int_{x_1} f_{X_1}(x_1) \sum_{j_1=1}^{N} P(j_1 \mid q_1(x_1)) \int_{x_2} f_{X_2}(x_2 \mid x_1) \sum_{j_2=1}^{N} P(j_2 \mid q_2(x_2)) \, (x_2 - g_2(j_1, j_2))^2 \, dx_2 \, dx_1. \qquad (2.8)$$

In these equations, f_X1(x1) denotes the pdf of X1, P(i2 | x1) denotes the conditional probability that x2 is encoded to i2 given x1, and f_X2(x2 | x1) denotes the conditional pdf of X2 given x1. Analytical expressions for all of these can be found in Appendix A. As in the algorithm presented in Section 1.3 we will find the solution by iteratively updating the encoder and then the decoder. The optimal encoder q1 will depend not only on the decoder g1 but also on the encoder q2 and the decoder g2. Similarly, the optimal decoder g1 will depend on both of the encoders q1 and q2. Because of these interdependencies they will be updated in the following order: first q1, then g1 and g2, then q2, and finally g1 and g2 again.

2.2.1 Finding the Optimal Encoder q1

We want to design the optimal encoder q1 in MSE sense, given a fixed encoder q2 and fixed decoders g1 and g2. Looking at (2.7) and (2.8), the trick is to notice that f_X1(x1) is non-negative and therefore it is enough to minimize the MSE for each value of x1. In other words, the objective is to, for each x1, find the codeword i1 that minimizes the MSE

$$D(x_1, i_1) = D_1(x_1, i_1) + D_2(x_1, i_1) = E[(x_1 - \hat{X}_1)^2 \mid x_1, i_1] + E[(X_2 - \hat{X}_2)^2 \mid x_1, i_1] \qquad (2.9)$$

where

$$D_1(x_1, i_1) = \sum_{j_1=1}^{N} P(j_1 \mid i_1) \sum_{i_2=1}^{N} P(i_2 \mid x_1) \sum_{j_2=1}^{N} P(j_2 \mid i_2) \, (x_1 - g_1(j_1, j_2))^2 \qquad (2.10)$$

$$D_2(x_1, i_1) = \sum_{j_1=1}^{N} P(j_1 \mid i_1) \int_{x_2} f_{X_2}(x_2 \mid x_1) \sum_{j_2=1}^{N} P(j_2 \mid q_2(x_2)) \, (x_2 - g_2(j_1, j_2))^2 \, dx_2. \qquad (2.11)$$

The set A¹_i1 can now be defined as

$$A^1_{i_1} = \{x_1 : D(x_1, i_1) \le D(x_1, \tilde{i}_1), \; \forall \tilde{i}_1 \ne i_1\} \qquad \forall i_1 \in \{1, 2, \ldots, N\}. \qquad (2.12)$$

Unfortunately this inequality is not linear in x1 and there are no indications that the sets A¹_i1 will be intervals as in the single source case in Section 1.3. Instead of trying to find an analytical expression for the encoder, the approach is to do a full search and evaluate the distortion for all values of x1 and all codewords. The encoder is then designed by choosing the codeword that resulted in the lowest distortion for each value of x1.
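A sketch of this full search, assuming the per-point distortion D(x1, i1) of (2.9)-(2.11) has already been tabulated on a grid; the grid follows Section 2.3 and the distortion table here is only a placeholder.

```python
import numpy as np

def update_encoder(D):
    """Full-search encoder update: D[m, i] holds the tabulated distortion
    D(x_grid[m], i) from (2.9); each grid point gets the codeword that
    minimizes it, which defines the (possibly non-interval) sets A^1_i."""
    return np.argmin(D, axis=1)

x_grid = np.linspace(-3.5, 3.5, 1000)      # 1000 points, as in Section 2.3
D = np.random.rand(x_grid.size, 2 ** 3)    # placeholder for (2.9)-(2.11)
q1 = update_encoder(D)                     # q1[m] = codeword at x_grid[m]
```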

2.2.2 Finding the Optimal Decoder g1

Finding the optimal decoder g1 is equivalent to computing the best reconstruction points R¹ = {r¹_11, r¹_12, ..., r¹_NN}. When the encoders q1 and q2 are fixed, each reconstruction point r¹_j1j2 is simply computed as the expected value of x1 given that j1 and j2 are received. These are the optimal reconstruction points when the MSE distortion measure is used [12] and can be stated explicitly as

$$r^1_{j_1 j_2} = E[x_1 \mid j_1, j_2] = \frac{\displaystyle\int_{x_1} x_1 \, P(j_1 \mid q_1(x_1)) \sum_{i_2=1}^{N} P(j_2 \mid i_2) P(i_2 \mid x_1) \, f_{X_1}(x_1) \, dx_1}{\displaystyle\sum_{i_1=1}^{N} P(j_1 \mid i_1) \sum_{i_2=1}^{N} P(j_2 \mid i_2) \int_{\tilde{x}_1 \in A^1_{i_1}} f_{X_1}(\tilde{x}_1) \int_{x_2 \in A^2_{i_2}} f_{X_2}(x_2 \mid \tilde{x}_1) \, dx_2 \, d\tilde{x}_1}. \qquad (2.13)$$

The derivation of this expression can be found in Appendix A.
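Discretized on a grid, (2.13) becomes a ratio of weighted sums. In the sketch below, Pch is the BSC transition matrix with Pch[i, j] = P(j | i), q1 holds the encoder output at every grid point, f1 approximates f_X1 times the grid step, and P_i2_x1 tabulates P(i2 | x1) from (A.7); all of these names are assumptions of this sketch.

```python
import numpy as np

def reconstruction_points(x_grid, f1, q1, Pch, P_i2_x1):
    """Numerically evaluate (2.13) for every received pair (j1, j2)."""
    n = Pch.shape[0]
    r = np.zeros((n, n))
    # P(j2 | x1) = sum_{i2} P(i2 | x1) P(j2 | i2), for all j2 at once.
    P_j2_x1 = P_i2_x1 @ Pch                    # shape (grid, n)
    for j1 in range(n):
        w1 = Pch[q1, j1]                       # P(j1 | q1(x1)) on the grid
        for j2 in range(n):
            w = f1 * w1 * P_j2_x1[:, j2]       # joint weight on the grid
            r[j1, j2] = np.sum(x_grid * w) / np.sum(w)
    return r
```

The denominator np.sum(w) is the discretized P(j1, j2), so the expression mirrors the numerator-over-denominator structure of (2.13).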

2.2.3 Algorithm

To find encoders and decoders, from now on referred to as the system, for a given error probability ε and a given correlation-SNR C_SNR, the following algorithm is proposed:

1. Choose q1 and q2 to be two known initial encoders and compute the optimal decoders g1 and g2.

2. Set the iteration index k = 0 and the current distortion D(0) = ∞.

3. Set k = k + 1.

4. Find the optimal encoder q1 by using (2.12).

5. Find the optimal decoders g1 and g2 by using (2.13).

6. Find the optimal encoder q2 by using (2.12).

7. Find the optimal decoders g1 and g2 by using (2.13).

8. Evaluate the distortion D(k) for the system. If (D(k−1) − D(k))/D(k) < δ, where δ > 0, stop the iteration, otherwise go to Step 3.

As in the case of the Lloyd-Max algorithm, this will result in a locally optimal system that is not necessarily the global optimum. In [7] it was found that a good locally optimal system could be achieved by designing the encoder/decoder pair for a range of values of ε, 0 ≤ ε ≤ ε_max. This is done by stepping back and forth between 0 and ε_max in steps of Δε, and for each ε the algorithm is initialized with the system from the previous value of ε. The new system is kept if it results in a lower distortion than the previous system for this ε. The process of stepping back and forth is repeated until no further improvement is made. The reason this method improves the solutions is that at some points the system has converged to a poor local optimum, but sometimes when ε is increased (or decreased) the algorithm finds a better local optimum. By using this system as the initial system when stepping back, the poor local optimum is replaced with a new and better system. An illustration of this can be seen in Figure 2.2.

Figure 2.2: In (a), which shows the SDR after stepping from ε = 0 to ε = 0.1, it is clear that at ε = 0.001 the algorithm found a new, better local optimum. In (b) it can be seen how this new local optimum is used to improve the SDR for lower values of ε when stepping back from ε = 0.1 to ε = 0.

The idea of stepping back and forth is incorporated in the following way. Decide the range of values of ε and C_SNR for which you want to find a pair of encoders and decoders, for example ε = {0, ..., ε_max} and C_SNR = {−∞, ..., ∞} dB. For each value of C_SNR, start at ε = 0, step to ε_max and then back to 0. In each point, keep the system that results in the lowest distortion for that point and use this system as initialization for the next value of ε. After the stepping in the ε-direction is done, for each value of ε, step in the C_SNR-direction from C_SNR = ∞ dB to C_SNR = −∞ dB and then back to C_SNR = ∞ dB. As before, for each point, keep the system that results in the lowest distortion and use this system as initialization for the next value of C_SNR. In the final runs, this process of stepping back and forth was repeated two times with an additional stepping in the ε-direction at the end. A sketch of the sweep in one direction is given after the list below. The benefits of stepping back and forth are the following:

1. Poor locally optimal systems are removed to some extent.

2. It reduces the importance of the choice of initial encoders.

3. It ensures that a system A performs better than a system B if ε_A < ε_B and C_SNR,A > C_SNR,B.
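A sketch of the sweep in the ε-direction; design_system, standing for one run of the algorithm above and returning a (system, distortion) pair, is a hypothetical helper of this sketch.

```python
def sweep_eps(eps_grid, csnr, design_system, init):
    """Step from eps_grid[0] up to eps_grid[-1] and back, keeping for
    each eps the best system found and seeding each run with its
    neighbor's system."""
    best = {}                                  # eps -> (system, distortion)
    system = init
    for eps in list(eps_grid) + list(reversed(eps_grid)):
        system, dist = design_system(eps, csnr, init=system)
        if eps not in best or dist < best[eps][1]:
            best[eps] = (system, dist)         # new local optimum is better
        else:
            system = best[eps][0]              # fall back to the stored one
    return best
```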

2.3 Numerical Results

We have now come to the point where it is time to evaluate the performance of the systems that are generated by the algorithm presented above. The systems that will be tested have b = {2, 3, 4} bit encoders designed for C_SNR = {−∞, 10, 20, 30, ∞} dB and ε = {0, 0.001, 0.003, 0.005, 0.01, 0.02, ..., 0.1}. There are some more parameters that need to be decided in the implementation of the algorithm, for example the step length that should be used when numerically evaluating all integrations and also the step length of x1 and x2 that is used in the full search to find the encoders q1 and q2. These were set as follows. All integrations between α and β were evaluated between max(α, −5σ) and min(β, 5σ) with a step size of σ/200, where σ is the standard deviation associated with the pdf that is being integrated. When updating the encoders, the best codeword was searched for and found for 1000 points between −3.5σ_X and 3.5σ_X, where σ_X is the standard deviation of X1 and X2. As the initial encoders q1 and q2 for all values of C_SNR and ε = 0, the encoder from the corresponding Lloyd-Max quantizer was used. The FBC has been used as the initial codeword assignment in all cases except for the 2 bit case, where the NBC has proven experimentally to perform slightly better.

Figure 2.3: Graph showing the performance of a 2 bit system in comparison to the COSQ (dashed) with 2 and 4 bits. It should be pointed out that the COSQ is only tested for 0 ≤ ε ≤ 0.1 with a step length of 0.002, which makes the graph misleading at the point ε = 0.001, since the COSQ is not evaluated at this point.

Figure 2.3 shows the performance of a system with b = 2 bits in comparison to the COSQ with b = {2, 4} bits. The reason for this comparison is that when C_SNR = −∞ dB the performance should be the same as for the COSQ with the same number of bits, and when C_SNR = ∞ dB, ideally, the performance should be the same as for the COSQ with twice as many bits. Indeed, for the 2 bit system this is the case for all values of ε. Results for systems with 3 bits and 4 bits can be found in Appendix B. When the number of bits is increased, the performance gain of the distributed approach is not as good for all values of ε, but as soon as ε ≥ 0.003 the two curves coincide even in the case of 4 bits. It should however be pointed out that the system that, for fully correlated sources and ε = 0, obtains the same performance as the COSQ with twice as many bits is highly structured and therefore not easy to find with a greedy search algorithm.

Let us take a deeper look at the 3 bit system for sources with C_SNR = 20 dB to see how the tradeoff between quantization distortion and robustness against channel errors works in practice. For ε = 0 it can be seen that the SDR is about 4.5 dB higher than for the COSQ. To understand how this is possible we have to look at the structure of the two encoders q1 and q2. As can be seen in Figure 2.4(a), some of the codewords are used for more than one quantization region; for example, the codeword i1 = 7 is used for the regions A¹_7,1 = (−∞, −3.1], A¹_7,2 = (−2.4, −2.1] and A¹_7,3 = (−1.7, −1.3]. With information from only one of the channels it would not be possible to distinguish between the different regions, but with help from the other channel they can be distinguished, because i2 = 2 for the region A¹_7,1, i2 = 4 for A¹_7,2 and i2 ∈ {1, 5, 6} for A¹_7,3. In this way the distributed coding is used to decrease the quantization distortion. When the same system is designed for ε = 0.04 the resulting encoders get the structure shown in Figure 2.4(b). In this case the second encoder, q2, does not use all of the codewords that are available. Instead, some codewords that would make the system sensitive to channel errors are removed in order to make the system more robust. For the first encoder, q1, all codewords are used, so one could say that q1 is used to give low quantization distortion and q2 is used to protect against channel errors.

2.3.1 Channel Mismatch

In a real application it is not likely that the true value of the crossover probability ε of the BSC is known; it may even vary over time. Therefore it is interesting to study the performance of the system when there is a channel mismatch, that is, when the system is designed for ε_D but the true value of the channel's crossover probability is ε_T ≠ ε_D. In general it is found that the SDR quickly decreases as soon as ε_T > ε_D, and when ε_T < ε_D there is a small gain in SDR compared to the case when ε_T = ε_D. An example of this can be seen in Figure 2.5, where a 3 bit system designed for C_SNR = 20 dB and different values of ε_D is tested for different values of ε_T.

Figure 2.4: Encoder structures for 3 bit systems with C_SNR = 20 dB and ε = 0 in (a) and ε = 0.04 in (b). The small dots in the background show a sample distribution of (X1, X2), the dashed lines show the boundaries of the quantization regions and the small crosses mark all different pairs of reconstruction points (X̂1, X̂2). The crosses that seem to be misplaced correspond to combinations of codewords that are unlikely to be received.

Figure 2.5: 3 bit system, C_SNR = 20 dB, tested for different channel mismatches.

2.3.2 Correlation-SNR Mismatch

The true value of the second design parameter, C_SNR, is not very likely to be known either. This parameter is found to be even more critical for the performance of the system. Figure 2.6 illustrates this: five different 4 bit systems, designed for a correlation-SNR of −∞, 10, 17, 20 and 30 dB respectively, are tested on two sources where the true correlation-SNR is 17 dB. In this example, even the system designed for no correlation at all performs better than the system designed for C_SNR = 20 dB when ε = 0. But when ε is increased, the 10 dB and the 20 dB systems have the same performance.

Figure 2.6: 4 bit systems, designed for different values of C_SNR, tested on two sources where the true C_SNR = 17 dB.


Chapter 3

Reducing Complexity

The algorithm presented in Chapter 2 works very well in practice, but the complexity of finding the encoders and the decoders grows exponentially with the number of bits that are used. Because of this we will investigate some methods that reduce the complexity, which makes it possible to increase the bit rate.

3.1 Multistage Quantization

The first approach is inspired by the field of vector quantization, where it is common to use several stages of quantization. That is, instead of doing the quantization in a single step with, for example, 20 bits, the quantization is divided into two stages where you first use a 10 bit quantizer and then a second stage with another 10 bits to quantize the error from the first stage. When adapting this method to correlated sources the idea is to use the system from Chapter 2 in the first stage of each channel and two independent COSQs from Section 1.3.1 in the second stages. The complete system can be seen in Figure 3.1.
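The principle, sketched on an error-free channel with two uniform stages; the uniform quantizer is a simplification made by this sketch, whereas the thesis uses the Chapter 2 system in the first stage and COSQs in the second.

```python
import numpy as np

def uq(x, b, lo, hi):
    """Uniform b bit quantizer on [lo, hi]; returns the reconstruction."""
    n = 2 ** b
    step = (hi - lo) / n
    i = np.clip(((x - lo) / step).astype(int), 0, n - 1)
    return lo + (i + 0.5) * step

x = np.random.randn(100_000)
stage1 = uq(x, 5, -4.0, 4.0)               # quantize x itself
stage2 = uq(x - stage1, 5, -0.125, 0.125)  # quantize the stage-1 error
x_hat = stage1 + stage2
print(np.mean((x - stage1) ** 2), np.mean((x - x_hat) ** 2))
```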

3.1.1 Analysis

Figure 3.1: Multistage quantization system.

Figure 3.2: Part of the multistage system.

Let us focus on the first channel with the encoders q11 and q12 and the decoders g11 and g12. The most important thing to realize is that the value that should be sent to the second-stage encoder q12 is X1 − E[X̂11 | X1]. A proof of this can be constructed in the following way. First of all, group the encoder q11, the BSC and the decoder g11 into a single unit called Q11. Assume further that we are allowed to add a value A to the output of Q11, where A is a function of X1, as seen in Figure 3.2. For each value of X1 we want to find the value of A that minimizes the MSE; this is done by setting the derivative of the MSE to zero and solving for A:

$$D(X_1) = E[(X_1 - \hat{X}_1)^2 \mid X_1] = E[(X_1 - \hat{X}_{11} - A)^2 \mid X_1] \qquad (3.1)$$

$$\frac{\partial D(X_1)}{\partial A} = -2E[(X_1 - \hat{X}_{11} - A) \mid X_1] = -2(X_1 - E[\hat{X}_{11} \mid X_1] - A) \;\Rightarrow\; A = X_1 - E[\hat{X}_{11} \mid X_1] \qquad (3.2)$$

Algorithm

The algorithm for designing the four encoders and the four decoders is as follows:

1. Design q11, q21, g11 and g21 with the algorithm presented in Section 2.2.3.

2. Run simulations on the system to collect sample data.

3. Use the sample data to optimize the two independent COSQ systems, q12/g12 and q22/g22.

Figure 3.3: Variance of E[X̂11 | X1] and E[X̂21 | X2] as a function of X1 and X2. The values are normalized in such a way that the value on the y-axis is the SDR that would be obtained for the corresponding value on the x-axis if X was reconstructed as X̂1 = X̂11 + X1 − E[X̂11 | X1]. The system being analyzed is a 3 bit system designed for C_SNR = 20 dB and ε = 0.

3.1.2 Numerical Results

After only a few simulations it turned out that the variances of the estimates E[X̂11 | X1] and E[X̂21 | X2] are of the same magnitude as the MSE most of the time. This implies that even if you could send X1 − E[X̂11 | X1] without any error and add that value to X̂11, the SDR would only increase slightly. In Figure 3.3 the variances of E[X̂11 | X1] and E[X̂21 | X2] are plotted as functions of X1 and X2 respectively.

The only exceptions to the large variance of the estimate E[X̂11 | X1] are the cases when either C_SNR is very high (∞ dB) or very low (< 5–10 dB) and ε is very low. But none of these cases is interesting, since a very high C_SNR is not very realistic and, in the case of a low C_SNR, two stand-alone COSQs perform better and are simpler to implement.

Figure 3.4: System with two signals X1 and X2 that are encoded separately but decoded jointly at a fusion center.

3.2 Dual COSQ with Optimal Decoders

The second approach to reducing the complexity is to use the same system as in Chapter 2, included again in Figure 3.4 for convenience. But instead of iteratively finding the encoders and the decoders, two fixed and identical encoders are chosen from a set of known encoders, and then the optimal decoders are computed for these encoders according to (2.13). The encoders generated for the COSQ case are used as the set from which the encoders are chosen. The inspiration for this approach comes from the observation that, at some point when the crossover probability ε is increased, the correlation of the sources is no longer used to reduce the quantization distortion but rather to protect the data against errors. For example, looking at Figure B.3 it can be seen that for the 4 bit system with C_SNR = 20 dB, this point is around ε = 0.01.

3.2.1 Analysis

The algorithm for finding the encoders is a simple search algorithm that, for each value of ε and C_SNR, searches for the encoder that minimizes the MSE. As mentioned before, the search is made among the set of encoders from the COSQ case. To be able to evaluate the MSE, the optimal decoders must be computed as well. This is done in the same way as in Section 2.2.2. The reason for not just picking the COSQ encoder for the corresponding value of ε, but instead searching among the whole set of COSQ encoders, is that for a specific value of ε the dual-COSQ approach is expected to have less quantization distortion than the COSQ. This implies that the best encoder to use for the dual-COSQ when ε = ε_DUAL is expected to come from a COSQ system with ε_COSQ ≤ ε_DUAL.
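A sketch of this search; cosq_encoders (one stored encoder per design value of ε), optimal_decoders implementing (2.13) and evaluate_mse are hypothetical helpers of this sketch.

```python
def dual_cosq_design(cosq_encoders, eps, csnr, optimal_decoders, evaluate_mse):
    """Pick the stored COSQ encoder that, used on both channels together
    with the optimal joint decoders of (2.13), gives the lowest MSE."""
    best = None
    for q in cosq_encoders:        # candidates from all design eps values
        g1, g2 = optimal_decoders(q, q, eps, csnr)
        mse = evaluate_mse(q, q, g1, g2, eps, csnr)
        if best is None or mse < best[0]:
            best = (mse, q, g1, g2)
    return best
```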

Figure 3.5: Performance of the 4 bit dual-COSQ approach (solid) compared to the system from Chapter 2 (dashed) for C_SNR = 10 and 20 dB. The dotted line shows the performance of the single source COSQ.

3.2.2 Numerical Results

As can be seen in Figure 3.5, this approach is inferior to the system presented in Chapter 2. However, for ε > 0.001 and C_SNR = 20 dB, it is an improvement of around 2 dB compared to the single source COSQ system.


Chapter 4

Conclusions

4.1 Comments on the Results

The algorithm presented in Chapter 2 works very well in practice. Somewhat surprisingly, the resulting encoders even use the same codeword for several regions, so that the quantization distortion is reduced in the case of small ε. For higher values of ε the correlation is instead used for protection against channel errors. If the COSQ is assumed to be a good scalar quantizer, then this system seems to be a good scalar quantizer for correlated sources. This is because the systems designed for full correlation have the same performance as the COSQ with twice as many bits. The only exception to this is when ε < 0.003.

The complexity is an issue. Of course, the implementation of the design algorithm can certainly be done in a more efficient way, but unfortunately that does not change the complexity of the underlying equations. If more sources were added the complexity would increase even more.

The multistage system presented in Section 3.1 does not give any significant improvement in performance. Even if there had been an improvement, there is a substantial increase in complexity to implement such an encoding system.

The dual-COSQ system presented in Section 3.2 works satisfactorily. The system is of course suboptimal, but the number of bits used by the encoder can be increased and a higher SDR may thereby be obtained.

To avoid poor performance because of mismatches between the design parameters of the system and the true values of these parameters, ε and C_SNR should be chosen with some margin. This proves to be especially important for C_SNR, as seen in the example in Section 2.3.2. But ε should also be chosen with care: if there is any doubt whether the channel is error free or not, the system should be designed for ε > 0.

4.2 Future Work

This section contains some ideas for future work and improvements that could be made. First of all, the complexity of the design algorithm could be decreased by only looking at single bit errors for each transmitted codeword; for low values of ε these are the most likely errors to occur. It is also possible that the complexity could be reduced by enforcing some kind of structure on the encoders. For use in WSNs, the number of possible sources should be increased. One way to do this without increasing the complexity too much is to decode the data from correlated sensors in pairs. The final estimate of a sensor's reading could then be calculated as a weighted sum of the readings from all pairs of sensors, with less weight put on values that seem unreasonable.


Appendix A

Derivations

All derivations are done for the following system

$$X_1 = Y + N_1 \qquad (A.1)$$
$$X_2 = Y + N_2 \qquad (A.2)$$

where Y, N1 and N2 are three independent Gaussian distributed random variables with variances σ_Y², σ_N1², σ_N2² and means m_Y, m_N1, m_N2 respectively.

A.1 pdf of a Gaussian Variable

The pdf, f_Y(y), of the Gaussian variable Y is defined as

$$f_Y(y) = \frac{1}{\sqrt{2\pi\sigma_Y^2}} \exp\!\left(-\frac{(y - m_Y)^2}{2\sigma_Y^2}\right). \qquad (A.3)$$

A.2 pdf of the Sum of Two Gaussian Variables

The pdf of the sum of two Gaussian variables, as in the case of X1 and X2, is calculated as the convolution of their pdfs. With the substitutions a = x1 − m_N1 and σ_X1² = σ_Y² + σ_N1²,

$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_Y(t) f_{N_1}(x_1 - t)\, dt = \frac{1}{2\pi\sqrt{\sigma_Y^2 \sigma_{N_1}^2}} \int_{-\infty}^{\infty} \exp\!\left(-\frac{(t - m_Y)^2}{2\sigma_Y^2} - \frac{(a - t)^2}{2\sigma_{N_1}^2}\right) dt.$$

Completing the square in t turns the integrand into a Gaussian whose integral equals $\sqrt{2\pi\sigma_Y^2\sigma_{N_1}^2/\sigma_{X_1}^2}$, which, with $m_{X_1} = m_Y + m_{N_1}$, leaves

$$f_{X_1}(x_1) = \frac{1}{\sqrt{2\pi\sigma_{X_1}^2}} \exp\!\left(-\frac{(x_1 - m_{X_1})^2}{2\sigma_{X_1}^2}\right). \qquad (A.4)$$

It can be seen in (A.4) that the sum of two Gaussian variables is also a Gaussian distributed variable.

A.3 Conditional pdf of a Gaussian Variable

The derivation of the conditional pdf, f_X2(x2 | x1), for the Gaussian variable X2 given that x1 is known is not hard but quite tedious. In the following derivation the most important steps are shown. Since N2 is independent of X1,

$$f_{X_2}(x_2 \mid x_1) = \int_{-\infty}^{\infty} f_Y(t \mid x_1)\, f_{N_2}(x_2 - t)\, dt.$$

By Bayes' rule,

$$f_Y(y \mid x_1) = \frac{f_{X_1}(x_1 \mid y)\, f_Y(y)}{f_{X_1}(x_1)} = \frac{f_{N_1}(x_1 - y)\, f_Y(y)}{f_{X_1}(x_1)}$$

so that

$$f_{X_2}(x_2 \mid x_1) = \frac{1}{f_{X_1}(x_1)} \int_{-\infty}^{\infty} f_Y(t)\, f_{N_1}(x_1 - t)\, f_{N_2}(x_2 - t)\, dt. \qquad (A.5)$$

With the substitutions $\tilde{x}_1 = x_1 - m_{N_1} - m_Y$, $\tilde{x}_2 = x_2 - m_{N_2} - m_Y$ and $\tilde{t} = t - m_Y$, completing the square twice and defining

$$\sigma_{\tilde{X}_2}^2 = \frac{\sigma_{N_1}^2\sigma_{N_2}^2 + \sigma_{N_1}^2\sigma_Y^2 + \sigma_{N_2}^2\sigma_Y^2}{\sigma_Y^2 + \sigma_{N_1}^2}, \qquad m_{\tilde{X}_2} = \frac{\sigma_Y^2}{\sigma_Y^2 + \sigma_{N_1}^2}\,\tilde{x}_1,$$

the integral evaluates to a Gaussian, giving

$$f_{X_2}(x_2 \mid x_1) = \frac{1}{\sqrt{2\pi\sigma_{\tilde{X}_2}^2}} \exp\!\left(-\frac{\left(x_2 - m_{N_2} - m_Y - \frac{\sigma_Y^2}{\sigma_Y^2 + \sigma_{N_1}^2}(x_1 - m_{N_1} - m_Y)\right)^2}{2\sigma_{\tilde{X}_2}^2}\right). \qquad (A.6)$$

Once again, the pdf shows that X2 | x1 is also a Gaussian variable.

A.4 P(i2 | x1)

The conditional probability that x2 will be quantized to i2 given that x1 is known is simply computed by an integration of the conditional pdf:

$$P(i_2 \mid x_1) = \int_{x_2 \in A^2_{i_2}} f_{X_2}(x_2 \mid x_1)\, dx_2 \qquad (A.7)$$

A.5 Reconstruction Points

The reconstruction point r¹_j1j2 is computed as the expected value of x1 given that j1 and j2 are received:

$$r^1_{j_1 j_2} = E[x_1 \mid j_1, j_2] = \int_{x_1} x_1\, f_{X_1}(x_1 \mid j_1, j_2)\, dx_1 \qquad (A.8)$$

By Bayes' rule,

$$f_{X_1}(x_1 \mid j_1, j_2) = \frac{P(j_1, j_2 \mid x_1)\, f_{X_1}(x_1)}{P(j_1, j_2)} \qquad (A.9)$$

where

$$P(j_1, j_2) = \int_{x_1} f_{X_1}(\tilde{x}_1) P(j_1 \mid q_1(\tilde{x}_1)) \int_{x_2} f_{X_2}(x_2 \mid \tilde{x}_1) P(j_2 \mid q_2(x_2))\, dx_2\, d\tilde{x}_1 = \sum_{i_1=1}^{N} P(j_1 \mid i_1) \sum_{i_2=1}^{N} P(j_2 \mid i_2) \int_{\tilde{x}_1 \in A^1_{i_1}} f_{X_1}(\tilde{x}_1) \int_{x_2 \in A^2_{i_2}} f_{X_2}(x_2 \mid \tilde{x}_1)\, dx_2\, d\tilde{x}_1 \qquad (A.10)$$

and

$$P(j_1, j_2 \mid x_1) = P(j_1 \mid q_1(x_1)) \int_{x_2} f_{X_2}(x_2 \mid x_1) P(j_2 \mid q_2(x_2))\, dx_2 = P(j_1 \mid q_1(x_1)) \sum_{i_2=1}^{N} P(j_2 \mid i_2) P(i_2 \mid x_1). \qquad (A.11)$$

Combining these gives

$$r^1_{j_1 j_2} = \frac{\displaystyle\int_{x_1} x_1\, P(j_1 \mid q_1(x_1)) \sum_{i_2=1}^{N} P(j_2 \mid i_2) P(i_2 \mid x_1)\, f_{X_1}(x_1)\, dx_1}{\displaystyle\sum_{i_1=1}^{N} P(j_1 \mid i_1) \sum_{i_2=1}^{N} P(j_2 \mid i_2) \int_{\tilde{x}_1 \in A^1_{i_1}} f_{X_1}(\tilde{x}_1) \int_{x_2 \in A^2_{i_2}} f_{X_2}(x_2 \mid \tilde{x}_1)\, dx_2\, d\tilde{x}_1}. \qquad (A.12)$$

Appendix B

Numerical Results

Figures B.1, B.2 and B.3 show the performance of systems with 2, 3 and 4 bits respectively. The system is optimized for each specific value of ε and C_SNR, and the performance is evaluated by sending 8000000 simulated samples over each channel. Since the optimization criterion is the sum of the MSE for the two channels, the SDR for the two channels is not exactly the same. The values shown in the graphs are obtained by treating all the 2 × 8000000 samples as a single signal of 16000000 samples. It should be pointed out that the COSQ is only tested for 0 ≤ ε ≤ 0.1 with a step length of 0.002, which makes the graphs misleading at the point ε = 0.001, since the COSQ is not evaluated at this point.


Figure B.1: Graph showing the performance of a 2 bit system in comparison to the COSQ (dashed) with 2 and 4 bits.

Figure B.2: Graph showing the performance of a 3 bit system in comparison to the COSQ (dashed) with 3 and 6 bits.


Figure B.3: Graph showing the performance of a 4 bit system in comparison to the COSQ (dashed) with 4 and 8 bits.


Bibliography

[1] S. P. Lloyd, "Least Squares Quantization in PCM," Bell Telephone Laboratories Paper, 1957.

[2] S. P. Lloyd, "Least Squares Quantization in PCM," IEEE Trans. on Information Theory, vol. IT-28, no. 2, pp. 129–137, March 1982.

[3] J. Max, "Quantizing for Minimum Distortion," IRE Trans. on Information Theory, vol. IT-6, pp. 7–12, March 1960.

[4] X. Huang, A. Acero and H.-W. Hon, Spoken Language Processing, Prentice-Hall, Upper Saddle River, New Jersey 07458, USA, 2001.

[5] Y. Linde, A. Buzo and R. M. Gray, "An Algorithm for Vector Quantizer Design," IEEE Trans. on Communications, vol. COM-28, no. 1, pp. 84–95, January 1980.

[6] T. M. Cover and J. A. Thomas, Elements of Information Theory, Wiley-Interscience, 1991.

[7] N. Farvardin and V. Vaishampayan, "Optimal Quantizer Design for Noisy Channels: An Approach to Combined Source-Channel Coding," IEEE Trans. on Information Theory, vol. 33, no. 6, pp. 827–838, November 1987.

[8] D. Slepian and J. K. Wolf, "Noiseless Coding of Correlated Information Sources," IEEE Trans. on Information Theory, vol. IT-19, no. 4, pp. 471–480, July 1973.

[9] A. D. Wyner and J. Ziv, "The Rate-Distortion Function for Source Coding with Side Information at the Decoder," IEEE Trans. on Information Theory, vol. IT-22, no. 1, pp. 1–10, January 1976.

[10] S. S. Pradhan, J. Kusuma and K. Ramchandran, "Distributed Compression in a Dense Microsensor Network," IEEE Signal Processing Magazine, vol. 19, no. 2, pp. 51–60, March 2002.

[11] Z. Xiong, A. D. Liveris and S. Cheng, "Distributed Source Coding for Sensor Networks," IEEE Signal Processing Magazine, vol. 21, no. 5, pp. 80–94, September 2004.

[12] B. Kleijn, A Basis for Source Coding, KTH (Royal Institute of Technology), 100 44 Stockholm, Sweden, 2004.
