Int. J. Sensor Networks, Vol. 4, Nos. 1/2, 2008


Correlated data gathering in wireless sensor networks based on distributed source coding

Guogang Hua
Department of Electrical and Computer Engineering, Florida Institute of Technology, Melbourne, FL 32901, USA
E-mail: [email protected]

Chang Wen Chen*
Department of Computer Science and Engineering, State University of New York at Buffalo, Buffalo, NY 14260, USA
E-mail: [email protected]
*Corresponding author

Abstract: We propose in this paper a novel scheme for correlated data gathering in energy- and bandwidth-limited wireless sensor networks based on Distributed Source Coding (DSC). We develop a special Viterbi Algorithm, denoted as VA-DSC, for decoding of the sensor data encoded by DSC. DSC principles have recently been applied to sensor data gathering by constructing practical DSC schemes using channel coding approach. However, existing schemes have not yet taken into account the inherent difference between source coding and channel coding. In this proposed algorithm, we take advantage of the known parity bits at the decoder when the data is encoded by DSC. When the proposed algorithm is applied to Recursive Systematic Convolutional (RSC) and Turbo codes, we demonstrate that VA-DSC is able to reduce both decoding error probability and computational complexity. When the proposed algorithm is applied to correlated data gathering in wireless sensor networks, we demonstrate that VA-DSC is also capable of receiving all data correctly, while, at the same time, reducing the energy consumption in the networks. Our simulation results show that the proposed scheme results in superior performance in terms of data reception accuracy and energy consumption efficiency.

Keywords: Distributed Source Coding; DSC; Viterbi Algorithm; VA; convolutional code; turbo code; wireless sensor networks; data aggregation.

Reference to this paper should be made as follows: Hua, G. and Chen, C.W. (2008) ‘Correlated data gathering in wireless sensor networks based on distributed source coding’, Int. J. Sensor Networks, Vol. 4, Nos. 1/2, pp.13–22.

Biographical notes: Guogang Hua is pursuing his PhD degree at the Department of Electrical and Computer Engineering, Florida Institute of Technology, Melbourne, USA. He received his BS and MS degrees from University of Science and Technology of China. His research interests include sensor networks, wireless communication, video coding, multimedia signal processing and communication, etc.

Chang Wen Chen has been Professor of Computer Science and Engineering at the State University of New York at Buffalo since January 2008. Previously, he was Allen S. Henry Distinguished Professor of Electrical and Computer Engineering at Florida Institute of Technology. He received his PhD degree from the University of Illinois at Urbana-Champaign. He is a Fellow of IEEE and SPIE. He is serving as Editor-in-Chief for IEEE Transactions on Circuits and Systems for Video Technology. He has over 200 publications in the areas of image and video coding, processing and analysis, mobile wireless multimedia transmission, secure multimedia communication and wireless sensor networks.

1 Introduction

Wireless sensor networks have been considered by many as a key technology for advancing our future development (Chong and Kumar, 2003). In recent years, research in wireless sensor

Copyright © 2008 Inderscience Enterprises Ltd.

networks has been undergoing a quiet revolution. The vast potential and the significant impact throughout society could quite possibly dwarf many previous milestones in the information revolution. The MIT Technology Review ranked wireless sensor networks, consisting of many tiny, low-power and cheap wireless sensors, as the number one emerging technology (Xiong et al., 2004). As a result, wireless sensor networks have attracted increasing attention from various research communities in science and engineering.

Typically, a sensor network consists of one or more ‘sinks’ that gather information of interest from a massive network of Sensor Nodes (SNs). The sensors in the network act as ‘sources’ that detect environmental events, usually called sensor readings, and send these readings to the sinks. The information of interest could be readings from all sensors or readings from only a subset of the sensors. For example, in a river water quality monitoring system, a number of sensors are deployed along the river, and the sink is located in the monitoring centre, which may be far away from the river. All sensors can send their information periodically to the sink, or the sink may send queries to all sensors, or to some sensors in a particular region, to gather information of interest. The sensors that receive queries send their readings back to the sink in response.

1.1 Energy-efficient data gathering in wireless sensor network

The most challenging issue in designing wireless sensor networks is that the key resource, namely energy supply, is very limited. Wireless sensor networks usually consist of battery-operated sensing devices with computing, data processing and communicating components. The lifetime of a sensor network is expected to be several months long and can be as long as several years. However, battery recharging is often extremely difficult or even impossible. Therefore, in order to have a long network lifetime, we need to reduce as much as possible any unnecessary energy consumption in the networks. Most current research has focused on issues such as energy-efficient Medium Access Control (MAC) and routing protocols, and a large number of protocols (Sohrabi et al., 2000; Kulkarni et al., 2002; Dam and Langendoen, 2003; Lu et al., 2004; Ye et al., 2004) have already been proposed.

However, in wireless sensor networks, multiple nodes often collectively perform the sensing task and communicate the sensing results to some common sinks. In many cases, SNs are densely deployed to sense the same physical phenomena. Hence, the information from neighbouring SNs is highly correlated. This implies that the data can be compressed in the sensor networks by exploiting the correlation within a neighbourhood of SNs. Compression is particularly attractive because the energy cost of wireless communication is usually much higher than the energy cost of processing. One fundamental difference between wireless sensor networks and current computer networks is that SNs do not always need to send every bit acquired in the network (sensed or received for forwarding), as long as the data gathering nodes can collectively extract the correct information from the bit stream they receive. That is to say, by exploiting the correlation among data, the nodes can process the information they sensed or received before transmitting it to their next hops, reducing the data flow and hence the energy consumption for data communication in the network.

This leads to an important area of research in sensor networks: information aggregation. This area of research has attracted significant attention recently (Pradhan et al., 2002; Boulis et al., 2003; Chou et al., 2003). One basic idea of information aggregation is to appropriately combine information from different sources to remove the redundancy within the data and to reduce the energy for transmission and reception via wireless communication.

1.2 Data gathering via distributed source coding

Clearly, the nodes can avoid sending much redundant information to the Base Station (BS) if SNs can communicate with each other. However, data communication between nodes consumes precious energy and bandwidth in wireless sensor networks. In some cases, it may even be impossible for all sensors to communicate with each other due to various physical constraints. Therefore, the key question for gathering correlated data in wireless sensor networks is: can a node reduce the redundancy without actually accessing data from neighbouring nodes? If so, no direct data communication among SNs is necessary in the networks, and both energy and bandwidth can be saved, which prolongs the lifetime of the sensor network.

The answer to this question is a resounding yes. The theoretical foundation for exploiting redundancy within correlated data was established more than 30 years ago by Slepian and Wolf in the now well-known Slepian–Wolf coding theorem (Slepian and Wolf, 1973). That paper constitutes the theoretical basis for the lossless compression of data acquired from correlated sources and has been the foundation of DSC for various applications (Pradhan et al., 2002; Puri and Ramchandran, 2002; Chou et al., 2003; Girod et al., 2005).

To apply DSC theory to sensor networks, a practical construction of DSC is necessary. Most DSC schemes developed recently are based on the duality between source coding and channel coding, in that channel coding techniques are applied to construct DSC schemes. Aaron and Girod (2002) proposed a practical DSC algorithm based on convolutional and turbo coding. In this research, the side information was viewed as a corrupted version of the original information. With the received syndrome bits and the demonstrated superior error correction capabilities of convolutional and turbo codes, the Viterbi Algorithm (VA) was used to reconstruct the difference between the original information and the side information. However, the authors did not consider the difference between source coding and channel coding. In channel coding, both data bits and parity bits may be corrupted. In DSC, by contrast, only the idea of channel coding is borrowed, and we can assume that the parity bits remain unchanged. In the process of decoding, these perfectly available parity bits can be used as useful constraints for VA. We can take advantage of this feature to make the decoding algorithm more effective and more efficient.

A variety of schemes have been developed for the application of DSC to data gathering in wireless sensor networks. Pradhan et al. (2002) introduced a framework for using DSC in wireless sensor networks which enables highly effective and efficient compression across a sensor network without the need to establish inter-node communication. However, no in-depth analysis for practical applications has been addressed. Boulis et al. (2003) proposed a distributed estimation algorithm that can be applied to a large class of data aggregation problems in wireless sensor networks. However, this aggregation algorithm is posed as an energy-accuracy trade-off, which means that it cannot guarantee correct gathering of all data. Errors in data gathering may be tolerable for some specific applications, but they are certainly not acceptable for some critical applications in the defence and medical fields.

More relevant to our work is the approach of Chou et al. (2003), who proposed a distributed and adaptive signal processing approach to reducing energy consumption in sensor networks. First, an adaptive correlation-tracking algorithm continuously tracks the amount of correlation that exists between SNs. Then, on the basis of the amount of correlation, a DSC algorithm based on tree-based codebook partition is used to code each node’s reading and send it to the data gathering node. The advantage of this algorithm is that both the encoding and decoding algorithms are very simple. The disadvantage is that the probability of decoding error is high. As demonstrated by Aaron and Girod (2002), DSC based on turbo codes outperforms DSC based on trellis codes. We also demonstrate with simulation results that when a retransmission scheme is applied to ensure correct data gathering, even when the correlation is very high, the simple DSC proposed by Chou et al. (2003) actually consumes more energy in the sensor network than raw data collection without aggregation.

1.3 Overview of the proposed research

The proposed research focuses on developing novel schemes for correlated data gathering in wireless sensor networks based on DSC principles. Although various existing approaches have successfully implemented DSC for wireless sensor network applications, one common problem of current research is the lack of an appropriate distinction between source coding and channel coding. When such a distinction is identified and taken into consideration, we are able to utilise the known parity bits for a significantly improved decoding performance in terms of both decoding accuracy and complexity. We have made use of this observation and designed a special constrained Viterbi Algorithm for Distributed Source Coding (VA-DSC) that has been implemented with both convolutional and turbo codes. The known parity bits are used to reduce the freedom of the state transitions in VA. In doing so, we are able to achieve more accurate decoding results because the state transitions are less uncertain, as they are restricted to states consistent with the known parity bits. At the same time, we are able to significantly reduce the complexity of VA, since the search space for the optimal solution becomes much smaller when some states are ruled out by the parity bits.

We have constructed DSC under such constraints and have applied the central idea of this constrained VA to several types of channel codes, including recursive systematic convolutional codes and turbo codes. We have also applied the proposed scheme to a special case of wireless sensor networks in which SNs are deployed in a chain-type fashion. Our simulation results show that the special constrained VA-DSC has a lower probability of decoding error and a lower computational complexity. The practical DSC scheme we developed based on convolutional and turbo codes for correlated data gathering in wireless sensor networks is able to ensure that all data can be gathered correctly and, at the same time, that the energy consumption in the wireless sensor networks can be reduced substantially.

The rest of this paper is organised as follows. Section 2 reviews DSC and its practical construction. This is followed by the description of the proposed special VA-DSC in Section 3. Section 4 presents the application of DSC in chain-type sensor networks, a special type of wireless sensor network. Section 5 presents the simulation results to confirm the superior performance of the proposed scheme. Finally, Section 6 concludes this paper with some discussions.

2 Distributed source coding and some practical constructions

2.1 Theory of distributed source coding

Consider the communication system shown in Figure 1. X and Y are correlated information sources; Y is usually called side information. On the encoder side, a switch K is set. When K is in the ‘on’ position, the encoder can access Y. Otherwise, when K is in the ‘off’ position, the encoder cannot access Y. We assume that the decoder can always access Y. It is well known that if the encoder can access Y, X can be encoded at a rate of H(X | Y), where

$$H(X \mid Y) = -\sum_{y} P_Y(y) \sum_{x} P_X(x \mid y) \log_2 P_X(x \mid y) \tag{1}$$

is often interpreted as the ‘uncertainty’ remaining in the random variable X, given the observation of Y.

Figure 1 A communication system with side information
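As a rough illustration of equation (1), the following sketch evaluates H(X | Y) for a small joint distribution. The function name `conditional_entropy` and the binary example (two fair bits that differ with probability 0.1) are assumptions made for illustration only and do not come from the paper.

```python
import math

def conditional_entropy(joint):
    """H(X|Y) in bits for a joint pmf given as {(x, y): probability}."""
    # Marginal P_Y(y) = sum over x of P(x, y).
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    h = 0.0
    for (x, y), p in joint.items():
        if p > 0.0:
            # P_Y(y) * P_X(x|y) equals P(x, y), so each term of
            # equation (1) is -P(x, y) * log2 P_X(x|y).
            h -= p * math.log2(p / p_y[y])
    return h

# Hypothetical example: X and Y are fair bits that differ with probability 0.1.
p = 0.1
joint = {
    (0, 0): 0.5 * (1 - p), (1, 1): 0.5 * (1 - p),
    (0, 1): 0.5 * p,       (1, 0): 0.5 * p,
}
h_xy = conditional_entropy(joint)
print(f"H(X|Y) = {h_xy:.3f} bits")
```

For this assumed distribution the result is well below the 1 bit needed to send X without any knowledge of Y, which is the gap that side information makes available.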


In 1973, Slepian and Wolf presented a surprising result (Slepian and Wolf, 1973): if X and Y are two correlated discrete-alphabet random variables, X can be encoded without access to Y at the encoder at the same rate, with no loss of compression performance compared with the case in which X is encoded with access to Y. At first, this may seem incomprehensible. Here is a simple example to show how it works. Suppose X and Y are correlated integers from 0 to 7 such that the difference between them is no greater than 1. Without compression, three bits are needed to encode X. However, if the decoder can access Y, then the encoder can group the eight numbers into the following four groups (cosets):

Coset-0 {0, 4}
Coset-1 {1, 5}
Coset-2 {2, 6}
Coset-3 {3, 7}

Without access to Y, the encoder only needs two bits to send the coset index to the decoder, and the decoder will know what X is, based on the index and the side information Y. For example, if coset index 0 is sent and the decoder knows that Y is, say, 1, then the decoder can conclude that X must be 0. Because X is from Coset-0, it can only be either 0 or 4; but if it were 4, the difference between X and Y would be 3, which is greater than 1, so X must be 0.

The above results were established only for discrete random variables. In 1976, Wyner and Ziv extended the results of Slepian and Wolf (1973) to lossy distributed compression by proving that, under certain conditions (Wyner and Ziv, 1976), there is no performance degradation for lossy compression with side information available only at the decoder as compared to lossy compression with side information available at both encoder and decoder. The results established by Slepian and Wolf (1973) and Wyner and Ziv (1976) are only theoretical results without practical implementations. More recently, Pradhan and Ramchandran (1999) and Aaron and Girod (2002) developed practical constructions for distributed compression in an attempt to achieve the bounds predicted by Slepian and Wolf (1973) and Wyner and Ziv (1976).

Distributed compression has several inherent advantages for compressing data in wireless sensor networks. First, distributed compression does not need to exchange data among sensors when compressing their data, and therefore reduces the data flow in the network. Second, with distributed compression, the data processing tasks are distributed over the whole network instead of being concentrated on a few particular nodes. When data processing is centralised on a few nodes, those nodes may deplete their energy more quickly than others. Therefore, a distributed compression scheme for data aggregation in wireless sensor networks is able to prolong the lifetime of the entire network.
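The coset example earlier in this section can be sketched in a few lines. The function names `encode` and `decode` are hypothetical; the sketch assumes, as in the example, 3-bit readings and side information that differs from X by at most 1.

```python
def encode(x):
    """Sensor side: send only the 2-bit coset index of a 3-bit reading."""
    return x % 4  # cosets {0, 4}, {1, 5}, {2, 6}, {3, 7}

def decode(coset, y):
    """Decoder side: pick the coset member within distance 1 of y."""
    candidates = [c for c in (coset, coset + 4) if abs(c - y) <= 1]
    if len(candidates) != 1:
        raise ValueError("side information violates |X - Y| <= 1")
    return candidates[0]

# Exhaustive check over every (X, Y) pair allowed by the correlation model.
pairs = [(x, y) for x in range(8) for y in range(8) if abs(x - y) <= 1]
all_ok = all(decode(encode(x), y) == x for x, y in pairs)
print(all_ok, decode(0, 1))
```

The two members of a coset are 4 apart, so at most one of them can lie within distance 1 of Y; this is why two bits suffice under the stated correlation model.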

2.2 Practical construction of distributed source coding

Most practical constructions of DSC proposed so far are based on channel coding principles. Because the side information Y is highly correlated with X, we can view Y as the output of X transmitted through a channel and corrupted by channel noise. The task is then to correct the corrupted Y in order to obtain X. Wyner (1974) first suggested the use of linear channel codes as a constructive approach for DSC. ‘Distributed Source Coding Using Syndromes’ (DISCUS), proposed by Pradhan and Ramchandran (1999), introduced a new constructive and practical framework for DSC based on trellis-structured codes. Error control coding has the ability to correct a transmitted signal corrupted by channel noise. A DSC scheme based on convolutional codes, shown in Figure 2, was proposed by Aaron (2001). In this case, the encoder is an N/(N+1) Recursive Systematic Convolutional (RSC) encoder. A systematic convolutional code is one in which the encoder output includes the exact input bits and some parity bits. For example, for a 2/3 systematic convolutional encoder as shown in Figure 3, if the input is 00, the output is 000 or 001: the first two bits are exactly the same as the input, while the last bit is the parity bit. The system only needs to transmit the parity bits, so the compression ratio is 1/N. At the decoder, the corresponding bits from the side information Y are combined with the received parity bits to reconstruct a corrupted version of the encoder output. The VA is then used to decode the reconstructed bit stream to obtain X̂, an estimate of X.

Figure 2 Distributed coding using convolutional code

Figure 3 A 2/3 Recursive Systematic Convolutional (RSC) encoder

Another high-performance error control code applied to DSC is the turbo code, as used for this purpose by Aaron and Girod (2002). Turbo codes have been demonstrated to have superior error-correction capabilities. Figure 4 shows the application of turbo codes to DSC by Aaron and Girod (2002). A turbo code consists of two parallel concatenated RSC encoders separated by an interleaver. Only the parity bits of the outputs of the two encoders are transmitted, so the compression ratio here is 2/N. At the decoder, the side information Y is used to reconstruct the ‘channel outputs’. The outputs are fed into the turbo decoder, which consists of two Soft-Input Soft-Output (SISO) constituent decoders, to decode X. The authors demonstrated that a DSC system based on turbo codes performed better than a system based on trellis codes.

Figure 4 Distributed coding using turbo code

3 The proposed constrained Viterbi Algorithm for distributed source coding

The Viterbi Algorithm, initially developed for digital communication, has become the theoretical basis for a wide range of applications from digital communication to DNA analysis and speech recognition. For convolutional codes and turbo codes, VA is the main algorithm to achieve optimal decoding. It is well known that VA performs a very efficient Maximum-Likelihood (ML) decoding of finite state signals observed in noise. It achieves this by stepping from the initial state to the end state and forming all possible paths from the beginning to the end. From all possible paths, the path with minimum path cost is selected as the survival path. Figure 5 gives a trellis diagram of a 4-state 2/3 RSC. From all possible trellis paths starting from state S0 and ending at state S0, a survival path (shown in bold line) is found. After finding the survival path, the algorithm traces back along the survival path to obtain the encoder input at each time t, then combines all the inputs from time 0 to time T to obtain the estimate of the input sequence.

Figure 5 Trellis diagram of a 4-state 2/3 RSC

In the general VA, the state transitions cover all possible state changes. However, there is a possibility that the algorithm chooses a wrong survival path and introduces even more errors during decoding. In many practical channel coding constructions, we may know that some of the transitions are impossible because of certain constraints. We can apply such knowledge to eliminate the impossible paths. Once this is done, the probability of selecting a wrong path can be significantly decreased. Based on this observation, we propose a constrained VA-DSC as applied to correlated data gathering in wireless sensor networks.

In a channel decoding algorithm, all bits received at the decoder may potentially be corrupted by channel noise. The traditional VA is based on this assumption and therefore has to consider all possible state transitions. However, since DSC is actually a source coding algorithm, for the purpose of source decoding the parity bits at the decoder side can be assumed to be the same as the parity bits at the encoder side. Hence, in DSC, the only possibly ‘corrupted’ bits are the systematic bits, which are therefore uncertain; the parity bits are exactly known at the decoder, and are therefore certain. The known parity bits offer an opportunity to apply them as constraints on the state transitions while implementing VA. Based on the comparison of DSC and channel coding as well as the previous analysis of the general VA, a constrained VA-DSC has been developed in this research to take full advantage of the certainty of the parity bits.

For any N/(N+1) RSC code, from each state at time t, there are 2^N possible state transitions to time t+1. Each possible transition generates a parity bit of 0 or 1. If we know the parity bit at time t, the known parity bit offers a constraint on the state transition. That is to say, at time t, of all 2^N transitions, those that generate a parity bit different from the received parity bit at time t are impossible transitions. In this case, the number of possible state transitions is reduced to 2^(N-1), instead of 2^N. For the RSC encoder shown in Figure 3, there are 2 input bits each time, so from any state, the trellis diagram has 4 (2^2) possible state transitions to the next time. An example of state transition in the traditional VA is shown in Figure 6. Figure 7 shows an example of state transition of the 2/3 RSC coder if the parity bits are known. From state S0, if the parity bit is known to be 1, then the next state cannot be S0 or S4, because either of those transitions would generate a parity bit of 0; the next state can only be S1 or S2. Similarly, from state S3, if the parity bit is known to be 0, the next state can only be S4 or S7.

There are two benefits of taking advantage of the known parity bits at the decoder in DSC. First, the probability of decoding error can be substantially decreased. In VA-DSC, some paths can be eliminated, and therefore such a constraint reduces


the possibility of selecting the wrong survival path. An extreme example is the 1/2 RSC: if the parity bits are known at the decoder, then from each state there is only one possible state transition to the next time; that is to say, from state S0 at time 0 there is only one path to state S0 at the end. If we trace back along this path, we get exactly the input at the encoder side, so no matter how many systematic bits are corrupted, the original sequence can still be decoded. Unfortunately, if we use the 1/2 RSC in distributed coding, there is no compression at all. Therefore, there is no benefit in applying the 1/2 RSC to DSC for wireless sensor networks.

Figure 6 State transition of the 2/3 RSC coder

Figure 7 State transition of the 2/3 RSC coder if parity bits are known

Here is a simple example which shows that VA-DSC performs better than the traditional VA. Suppose the input sequence for the 2/3 RSC encoder shown in Figure 3 is

{00 01 10 11 11 10}

After encoding with RSC, the output becomes

{000 011 100 111 110 101}

The parity bits of the output sequence (the last bit of each triple, {010101}) are transmitted. At the decoder, suppose the side information is

{00 10 10 11 11 10}

which differs from the input in the second symbol. After reconstruction, the sequence

{000 101 100 111 110 101}

is fed into the decoder. If the traditional VA is used, the decoded sequence is

{00 10 10 11 01 10}

One additional error is introduced during the decoding process, in the fifth symbol, for a total of 3 error bits. However, if VA-DSC is adopted, the exact original message sequence can be recovered.

The second benefit is that the computational complexity is reduced. As the analysis presented earlier shows, for an N/(N+1) RSC code, from each state at time t there are only 2^(N-1) instead of 2^N possible transitions to the next time. This means that only 2^(N-1) accumulated error metrics need to be calculated for each state at each time t. In the forward process, the computation is therefore only half of that of the traditional VA. The trace-back processes for both algorithms are the same. However, for VA, most computation is spent in the forward process. Therefore, the computational complexity of VA-DSC can be greatly reduced.
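The constrained transition rule described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the 4-state code in `step` uses made-up taps rather than the generators of the paper, the trellis is not terminated in state S0, and hard-decision Hamming metrics are used. The toy code is too weak to reproduce the error-correction result of the example above; the sketch only shows how transitions whose parity disagrees with the received parity are pruned, and how the number of branch metrics is halved.

```python
from itertools import product

N = 2                                      # input bits per step: rate N/(N+1) = 2/3
N_STATES = 4
INPUTS = list(product((0, 1), repeat=N))   # the 2^N input symbols

def step(state, u):
    """Toy recursive systematic code (hypothetical taps): (next_state, parity)."""
    s1, s0 = state >> 1, state & 1
    fb = u[0] ^ u[1] ^ s0                  # recursive feedback bit
    parity = fb ^ s1 ^ u[1]
    return (fb << 1) | s1, parity

def encode(symbols):
    """Parity stream for a list of N-bit symbols, starting from state 0."""
    state, out = 0, []
    for u in symbols:
        state, p = step(state, u)
        out.append(p)
    return out

def viterbi(side_info, parity, constrained):
    """Return (decoded symbols, number of branch metrics computed)."""
    inf = float("inf")
    cost = [0] + [inf] * (N_STATES - 1)    # the encoder starts in state 0
    paths = [[] for _ in range(N_STATES)]
    metrics = 0
    for y, p in zip(side_info, parity):
        new_cost = [inf] * N_STATES
        new_paths = [[] for _ in range(N_STATES)]
        for s in range(N_STATES):
            if cost[s] == inf:
                continue                   # state not reachable yet
            for u in INPUTS:
                nxt, q = step(s, u)
                if constrained and q != p:
                    continue               # VA-DSC: parity is known, so prune
                metrics += 1
                # Hamming distance between candidate symbol and side information
                d = sum(a != b for a, b in zip(u, y))
                if not constrained:
                    d += int(q != p)       # plain VA also scores the parity bit
                if cost[s] + d < new_cost[nxt]:
                    new_cost[nxt] = cost[s] + d
                    new_paths[nxt] = paths[s] + [u]
        cost, paths = new_cost, new_paths
    best = min(range(N_STATES), key=lambda s: cost[s])
    return paths[best], metrics

x = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1), (1, 0)]   # input from the example
y = [(0, 0), (1, 0), (1, 0), (1, 1), (1, 1), (1, 0)]   # side information
parity = encode(x)

plain, n_plain = viterbi(y, parity, constrained=False)
dsc, n_dsc = viterbi(y, parity, constrained=True)
```

In this sketch every path kept by the constrained decoder reproduces the received parity exactly, and `n_dsc` is half of `n_plain`, in line with the 2^(N-1) versus 2^N transition count. A practical decoder would precompute, for each state, the transitions grouped by parity bit so that the pruned branches are never enumerated at all.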

4 Distributed source coding in chain-type wireless sensor network

In general, sensors are deployed in two-dimensional or even three-dimensional space. However, there exists a special type of sensor network which is one-dimensional. We call this type of network a chain-type wireless sensor network (Chen et al., 2005; Wang and Chen, 2005; Chen and Wang, 2008). In this paper, we focus on the application of the proposed algorithm in this special type of wireless sensor network, in which the benefits of DSC are more straightforward to show. However, the proposed scheme can be applied to any general wireless sensor network for improved data gathering performance.

Chain-type wireless sensor networks are large-scale wireless sensor networks characterised by an elongated chain-type network topology and sparsely distributed SNs. Applications include water quality and resource monitoring along major rivers that may extend tens to hundreds of miles in length, marine habitat monitoring along coastal lines, traffic and transportation monitoring along major highways, and so on. A generic three-tier hierarchical architecture was recently proposed, as shown in Figure 8. Tier-1 consists of common SNs, which are grouped into clusters at different strategic sites along an elongated chain-type area. Each cluster has a Cluster Head Node (CHN), and the CHNs form tier-2. Tier-3 consists of the BSs. SNs perform the data sensing task and report to their local CHNs. CHNs aggregate the data streams from the related SNs and then forward them to the sink nodes (BSs). The communication distance within a cluster may range from a few dozen to a few hundred meters, while the distance between clusters may range from a few hundred to a few thousand meters, depending on the application. Such a hierarchical architecture can be expanded to include more tiers according to practical needs. Since the distances between clusters are relatively large, the data from different clusters may not have much correlation, so the data compression mainly focuses on intracluster data gathering.

Figure 8 The architecture of chain-type wireless sensor network

A cluster from a chain-type sensor network is showed in Figure 9, where node 0 is CHN, nodes 1,…,N are SNs. Suppose the information from the adjacent nodes and only the information from the adjacent nodes are correlated. That is to say the data from node 0 and node 1 are correlated, but the

Correlated data gathering in wireless sensor networks data from node 0 and node 2 are not correlated. According to Slepian–Wolf coding theory, if X and Y are two correlated sources, even though at the encoder end, X cannot access Y, X can be encoded with a rate of RX = H ( X | Y ) + ε X , where ε X > 0 and can be very small. At the decoding end, with the presence of Y as side information, X can be decoded with arbitrarily small error probability. Figure 9

A cluster from chain-type sensor networks

The following procedure shows how DSC works in one cluster, as shown in Figure 9:

1. Node 1 encodes its information $X_1$ at a rate of $R_{X_1} = H(X_1 \mid X_0) + \varepsilon_{X_1}$ and sends it to node 0 (CHN). A checksum of the original data $X_1$ is also sent.

2. Node 0 (CHN) uses its own information $X_0$ as side information and, with the received syndrome bits from node 1, tries to decode $X_1$. After decoding, node 0 calculates the checksum of the decoded data. If the checksum is the same as the received checksum, we assume that the CHN has received $X_1$ successfully; otherwise, the CHN asks node 1 to re-send its sensor data.

3. Node 2 encodes its data at rate $R_{X_2} = H(X_2 \mid X_1) + \varepsilon_{X_2}$ and sends it to node 0 through node 1; the checksum of the original data is also sent. At this time, node 0 already has the information of node 1, so node 0 can decode $X_2$ by using $X_1$ as side information. If an error occurs, it asks node 2 to re-send its sensor data.

4. This process continues for all nodes within a cluster.

Therefore, all nodes n (n = 1, 2, …, N) can encode their information at a rate of $R_{X_n} = H(X_n \mid X_{n-1}) + \varepsilon_{X_n}$ and send it to the CHN. At the CHN, $X_n$ can be decoded with $X_{n-1}$ as side information. With an appropriate retransmission protocol, it is guaranteed that the CHN receives all nodes' information correctly. After the CHN gathers all information from its cluster, it sends the encoded data together with its own data to the BS, and the BS can then decode all information correctly. The theoretical analysis above demonstrates that all nodes except CHNs can encode their information at a rate of $R_{X_n} = H(X_n \mid X_{n-1}) + \varepsilon_{X_n}$. In practice, we will try to make the rate as close as possible to the theoretical bound.

5 Experimental results

To verify the proposed algorithm for correlated data gathering based on VA-DSC, we first evaluate VA-DSC and compare it with the traditional VA in terms of decoding error probability and decoding speed. We also evaluate the energy consumption of DSC in a chain-type wireless sensor network and compare it with raw data transmission. For DSC, we compare the results of DSC using convolutional and turbo codes based on VA-DSC, DSC using convolutional and turbo codes based on the traditional VA, and DSC using the tree-based partition codes proposed by Pradhan and Ramchandran (1999). For RSC, we use an 8-state rate-2/3 RSC encoder. For the turbo code, two 16-state rate-4/5 RSC encoders are used as the constituent encoders. The RSC and the constituent codes for the turbo code are listed in Table 1 (Divsalar and Pollara, 1995).

Table 1 Code generators

Codes                          Rate   States   Generator
RSC                            2/3    8        h0 = 13; h1 = 15; h2 = 17
Constituent codes for turbo    4/5    16       h0 = 23; h1 = 35; h2 = 31; h3 = 37; h4 = 27
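As an aid to reading Table 1, the generators can be expanded into binary tap vectors. The sketch below assumes that the generators are written in octal, which is the usual convention for convolutional codes but is not stated explicitly in the text; the function name `octal_to_taps` is ours, and no assumption is made about which polynomial serves as the feedback.

```python
def octal_to_taps(generator_octal, num_states):
    """Expand an octal generator such as '13' into a list of binary taps.

    The tap vector has memory + 1 entries, where memory = log2(num_states),
    most significant tap first.
    """
    memory = num_states.bit_length() - 1          # 8 states -> 3, 16 states -> 4
    value = int(generator_octal, 8)
    if value >= 1 << (memory + 1):
        raise ValueError("generator does not fit the stated number of states")
    return [(value >> shift) & 1 for shift in range(memory, -1, -1)]


# Generators copied from Table 1 (assumed to be octal).
RSC_GENERATORS = {"h0": "13", "h1": "15", "h2": "17"}
TURBO_GENERATORS = {"h0": "23", "h1": "35", "h2": "31", "h3": "37", "h4": "27"}

rsc_taps = {name: octal_to_taps(g, 8) for name, g in RSC_GENERATORS.items()}
turbo_taps = {name: octal_to_taps(g, 16) for name, g in TURBO_GENERATORS.items()}
```

Under this reading every generator fits exactly in memory + 1 taps (four for the 8-state code, five for the 16-state code), which is consistent with the stated numbers of states.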

5.1 Decoding bit error in binary sequence

To evaluate the decoding error probability, two binary sequences, denoted X and Y, which have symmetric dependencies, are employed. That is, $P(X = Y) = 1 - p$ and $P(X \neq Y) = p$, where p is the cross-over probability. Therefore,

$$H(X \mid Y) = -p \log_2 p - (1 - p) \log_2 (1 - p). \tag{2}$$

For RSC, $10^4$ evenly distributed random binary sequences with sequence length L = 1000 are tested. For the turbo code, $10^3$ evenly distributed random binary sequences with sequence length L = 10,000 are tested. The simulation results for the decoding Bit Error Rate (BER) are shown in Figure 10. From these results, we can see that when the correlation is small, the performance of VA and VA-DSC is similar: for RSC, VA-DSC is slightly better than VA; for the turbo code, VA is slightly better than VA-DSC. We also found that RSC performs better than the turbo code when the correlation is small. This is because the rate of the turbo code is higher (4/5). When the correlation increases, both RSC and turbo codes with the VA-DSC algorithm perform much better than with the conventional VA algorithm.

To evaluate the computational complexity, we measure the running time needed to decode the same number of sequences under different conditions. For RSC codes, we use the VA and VA-DSC algorithms to decode the same 1000 sequences of length L = 1000 for different traceback depths. The running times are shown in Table 2. For turbo codes, we use the VA and VA-DSC algorithms to decode the same 10 sequences of length L = 1000 with different numbers of iterations; the running times are listed in Table 3. From the two tables, we find that the running time of the VA-DSC algorithm is less than 60% of that of the traditional VA algorithm under the same conditions. From the previous analysis, we know that in the forward process the computation of VA-DSC is half of that of the traditional VA algorithm, and that the forward process consumes most of the computation time of both algorithms. The experimental results validate the analysis.
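Equation (2) is easy to evaluate numerically, which helps relate a cross-over probability to the Slepian–Wolf bound. The sketch below is illustrative only: the function names are ours, and the rate of 0.5 bit per source bit used in the usage note is an example value rather than a figure reported in the experiments.

```python
import math


def conditional_entropy_bsc(p):
    """H(X|Y) of equation (2) for cross-over probability p, in bits."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if p in (0.0, 1.0):
        return 0.0                      # limit of p*log2(p) as p -> 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def max_crossover_for_rate(rate, tol=1e-9):
    """Largest p in [0, 0.5] with H(X|Y) <= rate, found by bisection.

    H(X|Y) is increasing on [0, 0.5], so bisection is sufficient.
    """
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if conditional_entropy_bsc(mid) <= rate:
            lo = mid
        else:
            hi = mid
    return lo
```

For example, `max_crossover_for_rate(0.5)` gives the cross-over probability at which the Slepian–Wolf bound equals half a bit per source bit; a practical code needs a smaller p than this to decode reliably at that rate.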


Table 2 Running time of RSC codes with VA and VA-DSC

Traceback depth        L/16     L/8      L/4      L/2      L
T_VA-DSC (s)           0.21     0.41     0.80     1.57     3.38
T_VA (s)               0.37     0.70     1.40     2.77     5.87
T_VA-DSC/T_VA (%)      56.76    58.57    57.14    56.68    57.78

Table 3 Running time of turbo codes with VA and VA-DSC

Iterations             5        10       15       20       25
T_VA-DSC (s)           1.92     3.98     5.61     7.50     9.69
T_VA (s)               3.25     6.75     9.68     12.84    16.40
T_VA-DSC/T_VA (%)      59.08    58.96    57.95    58.41    59.09

Figure 10 Decoding bit error rate performance

5.2 Energy efficiency in chain-type wireless sensor network

To evaluate the energy efficiency of data aggregation in wireless sensor networks, we simulate the data gathering scheme in one cluster that forms the chain-type wireless sensor network. In our simulation system, N in Figure 9 is set to 9; that is, there is a total of 10 nodes in each cluster, including the CHN. Each node $N_i$ (i = 0, 1, …, 9) generates a sequence of Independent Identically Distributed (i.i.d.) Gaussian random variables $X_i^1, X_i^2, \ldots, X_i^L$ per minute. Here L is set to 400. To simulate the data correlation, we set the cluster head's reading $X_0$ to be a zero-mean unit-variance Gaussian random sequence; for the ith (i = 1, …, 9) node, its reading $X_i$ is generated by

$$X_i = X_{i-1} + \sigma Z_i, \quad i = 1, 2, \ldots, 9. \tag{3}$$

Here $Z_i$ is another zero-mean unit-variance Gaussian random sequence and $\sigma$ is a parameter that adjusts the correlation between the data sensed by adjacent nodes. The ratio of the variance of X to that of $\sigma Z$ is called the Correlation-SNR (CSNR) by Pradhan and Ramchandran (1999). Therefore,

$$\mathrm{CSNR} = 10 \log_{10} \left( \frac{1}{\sigma^2} \right). \tag{4}$$

Before encoding and decoding, the sequences are quantised using a 4-level Lloyd–Max scalar quantiser (Aaron and Girod, 2002). A total of 800 bits are thus generated in each node in one minute. The quantised bit stream is fed into the RSC and turbo encoders, and only the parity bits are transmitted. Since DSC based on RSC or turbo codes cannot guarantee 100% correct decoding, an eight-bit checksum of the original quantised information is appended to the parity bits in the proposed system. We then use the procedure described in Section 4 to gather the data from all nodes in each cluster. In this simulation, we focus on the total energy consumption of the network with respect to the total number of data bits correctly received at the BS.

We use the energy model of Mica Motes to compute the total energy. In this model, the power consumptions of the radio in transmitting, receiving and sleep modes are 36 mW, 14.4 mW and 15 µW, respectively (RF Monolithics Inc.). The radio bandwidth is assumed to be 20 kbps. To compute the energy consumed by data processing, we set the energy used to encode or decode one bit of data as $E_{DA}$ = 5 nJ/bit/signal (Heinzelman et al., 2002). We assume one round of data gathering per minute. The simulation results are based on 5000 rounds of data gathering.
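The correlation model of equations (3) and (4) can be reproduced in a few lines. The following is a minimal sketch under stated assumptions: the function names, the seed and the default CSNR of 15 dB are ours and purely illustrative, and the Lloyd–Max quantisation step is omitted.

```python
import math
import random


def sigma_from_csnr(csnr_db):
    """Invert equation (4): CSNR = 10 log10(1 / sigma^2)."""
    return 10.0 ** (-csnr_db / 20.0)


def generate_chain_readings(num_nodes=10, length=400, csnr_db=15.0, seed=0):
    """Generate readings X_0, ..., X_{num_nodes-1} following equation (3).

    X_0 is zero-mean unit-variance Gaussian; X_i = X_{i-1} + sigma * Z_i,
    with Z_i an independent zero-mean unit-variance Gaussian sequence.
    """
    rng = random.Random(seed)
    sigma = sigma_from_csnr(csnr_db)
    readings = [[rng.gauss(0.0, 1.0) for _ in range(length)]]
    for _ in range(1, num_nodes):
        previous = readings[-1]
        readings.append([x + sigma * rng.gauss(0.0, 1.0) for x in previous])
    return readings
```

A higher CSNR gives a smaller $\sigma$, so the readings of adjacent nodes differ less and fewer bits are needed to describe one given the other.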

Figure 11 shows the energy efficiency of DSC based on RSC and turbo codes. The energy consumption per bit is obtained by dividing the total energy consumption of all nodes by the total number of bits successfully received and decoded by the CHN. The results show that distributed coding based on RSC and turbo codes is much more energy efficient than data transmission without coding. We also find that if the simple DSC construction method of Pradhan and Ramchandran (1999) is applied for data gathering, the energy consumption per bit is even higher than in the raw data transmission case. That is because the decoding error probability of the simple method may become too high for the decoder to handle. In these cases, many packets are decoded unsuccessfully and therefore need retransmission, which greatly increases the energy consumption in the network.

Figure 11 Energy consumption performance
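The energy-per-bit metric can be illustrated with a back-of-the-envelope calculation using the radio parameters quoted in this section. The sketch below is a simplification and not the simulator used for Figure 11: it ignores sleep energy and MAC overhead, charges $E_{DA}$ once for encoding and once for decoding each source bit, and treats the average number of transmissions per packet as a free parameter. The 408-bit coded packet in the usage note (400 parity bits plus the eight-bit checksum) is an assumed example.

```python
TX_POWER_W = 36e-3        # radio power while transmitting
RX_POWER_W = 14.4e-3      # radio power while receiving
BANDWIDTH_BPS = 20e3      # radio bandwidth
E_DA_J_PER_BIT = 5e-9     # processing energy per encoded or decoded bit


def hop_energy(bits):
    """Radio energy in joules to send and receive `bits` over one hop."""
    airtime_s = bits / BANDWIDTH_BPS
    return airtime_s * (TX_POWER_W + RX_POWER_W)


def energy_per_delivered_bit(source_bits, sent_bits, hops,
                             transmissions=1.0, coded=True):
    """Total energy divided by the source bits delivered to the cluster head.

    transmissions is the assumed average number of times the packet is sent,
    so values above 1 model retransmissions after decoding failures.
    """
    radio = transmissions * hops * hop_energy(sent_bits)
    processing = 2 * E_DA_J_PER_BIT * source_bits if coded else 0.0
    return (radio + processing) / source_bits
```

With these numbers, `energy_per_delivered_bit(800, 408, 1)` is roughly half of `energy_per_delivered_bit(800, 800, 1, coded=False)`, while raising `transmissions` above about two makes the coded case more expensive than raw transmission, which mirrors the retransmission effect described above.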

From the simulation results, we also find that, when the correlation is low, both RSC and turbo codes with VA-DSC are more energy efficient than with the traditional VA. This is because the VA-DSC scheme reduces the decoding error probability and therefore the need for retransmission. When the correlation is high, all four methods based on RSC and turbo codes have similar performance. This is because, when the correlation is high, almost all packets are decoded successfully at the first attempt. In this case, the energy consumption per bit is only about one-half of that of uncoded data transmission. The results also show that, with VA, the turbo code performs worse than RSC. One possible reason is the interleaving length. In general, turbo codes perform better with larger interleaving lengths. In our simulation, the interleaving length is only 800. Therefore, the advantage of turbo codes cannot be fully realised with such a short interleaving length.

6 Conclusion and discussion

In this paper, we proposed a novel scheme for correlated data gathering in wireless sensor networks based on DSC principles. We have designed a constrained VA, denoted VA-DSC, that takes advantage of the known parity bits at the decoder. When the algorithm is applied to convolutional and turbo codes, we are able to reduce both the decoding error probability and the computational complexity. The analysis as well as the experimental results show that the computational complexity of the VA-DSC algorithm is only about one-half of that of the traditional VA. At the same time, the decoding performance of VA-DSC is much better than that of VA because we make full use of the known information. We have also developed practical DSC schemes for data gathering in wireless sensor networks based on convolutional and turbo codes. These schemes are able to ensure that the data can be received correctly and that the energy consumption in the network can be significantly reduced. The simulation results confirm that the proposed data gathering based on VA-DSC is more energy efficient than uncoded data transmission. It is also more energy efficient than the existing DSC scheme using tree-based partition codes (Pradhan and Ramchandran, 1999).

We have also applied the proposed DSC scheme to data gathering in chain-type wireless sensor networks. Although the results are based on a specific correlation model for chain-type sensor networks, the principle can be easily extended to general wireless sensor network architectures, in particular the well-known LEACH architecture. Another extension of the proposed algorithm is its application to sensor data whose correlation varies across nodes. In this case, we can adopt Rate-Compatible Punctured Convolutional (RCPC) codes for VA-DSC to construct an adaptive data aggregation scheme in wireless sensor networks. When the correlation is high, we use fewer bits to encode the data. When the correlation is low, we use more bits to encode the data. The extension with RCPC and recursive data gathering will be reported in a separate paper.


References

Aaron, A. (2001) Distributed source coding. Available online at: http://www.stanford.edu/%7Eamaaron/dsc/index.htm
Aaron, A. and Girod, B. (2002) ‘Compression with side information using turbo codes’, Proceedings of the IEEE Data Compression Conference, DCC-2002, April, Snowbird, UT.
Boulis, A., Ganeriwal, S. and Srivastava, M.B. (2003) ‘Aggregation in sensor networks: an energy-accuracy tradeoff’, Elsevier Ad-hoc Networks Journal (special issue on sensor network protocols and applications), pp.317–331.
Chen, C.W. and Wang, Y. (2008) ‘Chain-type wireless sensor network for monitoring long range infrastructures: architecture and protocols’, International Journal of Distributed Sensor Networks, Vol. 4, No. 3.
Chen, C.W., Wang, Y. and Kostanic, I. (2005) ‘A chain-type wireless sensor network for monitoring long range infrastructures’, Proceedings of the SPIE, Vol. 5778, pp.444–455.
Chong, C. and Kumar, S.P. (2003) ‘Sensor networks: evolution, opportunities, and challenges’, Proceedings of the IEEE, August, Vol. 91, No. 8, pp.1247–1256.
Chou, J., Petrovic, D. and Ramchandran, K. (2003) ‘A distributed and adaptive signal processing approach to reducing energy consumption in sensor networks’, Proceedings of the IEEE INFOCOM, March, San Francisco, CA.
Dam, T.V. and Langendoen, K. (2003) ‘An adaptive energy-efficient MAC protocol for wireless sensor networks’, The First ACM Conference on Embedded Networked Sensor Systems (Sensys'03), November, Los Angeles, CA, USA.
Divsalar, D. and Pollara, F. (1995) ‘Multiple turbo codes’, Proceedings of the 14th Military Communications Conference, MILCOM, pp.279–285.
Girod, B., Aaron, A., Rane, S. and Rebollo-Monedero, D. (2005) ‘Distributed video coding’, Proceedings of the IEEE, Special Issue on Video Coding and Delivery, January, Vol. 93, No. 1, pp.71–83.
Heinzelman, W.B., Chandrakasan, A.P. and Balakrishnan, H. (2002) ‘An application-specific protocol architecture for wireless microsensor networks’, IEEE/ACM Transactions on Wireless Communications, October, Vol. 1, No. 4, pp.660–670.
Kulkarni, G., Schurgers, C. and Srivastava, M. (2002) ‘Dynamic link labels for energy efficient MAC headers in wireless sensor networks’, Proceedings of IEEE, 12–14 June, Vol. 2.
Lu, G., Krishnamachari, B. and Raghavendra, C. (2004) ‘An adaptive energy-efficient and low-latency MAC for data gathering in sensor networks’, The 4th International Workshop on Algorithms for Wireless, Mobile, Ad Hoc and Sensor Networks (WMAN 04), April.
Pradhan, S.S., Kusuma, J. and Ramchandran, J. (2002) ‘Distributed compression in a dense microsensor network’, IEEE Signal Processing Magazine, March, Vol. 19, No. 2, pp.51–60.
Pradhan, S.S. and Ramchandran, K. (1999) ‘Distributed source coding using syndromes (DISCUS): design and construction’, Proceedings of the Data Compression Conference (DCC), March, Los Alamitos, CA, USA, pp.158–167.
Puri, R. and Ramchandran, K. (2002) ‘PRISM: a new robust video coding architecture based on distributed compression principles’, The Allerton Conference on Communication, Control, and Computing, October, Allerton, IL.
RF Monolithics Inc. ASH Transceiver TR3000 Data Sheet. Available online at: http://www.rfm.com
Slepian, D. and Wolf, J.K. (1973) ‘Noiseless encoding of correlated information sources’, IEEE Transactions on Information Theory, July, Vol. IT-19, pp.471–480.
Sohrabi, K., Gao, J., Ailawadhi, V. and Pottie, G.J. (2000) ‘Protocols for self-organization of a wireless sensor network’, IEEE Personal Communications, October, Vol. 7, No. 5, pp.16–27.
Wang, Y. and Chen, C.W. (2005) ‘An energy-efficient media access control protocol for chain-type wireless sensor networks’, Proceedings of SPIE, Vol. 5819, pp.294–305.
Wyner, A.D. (1974) ‘Recent results in the Shannon theory’, IEEE Transactions on Information Theory, January, Vol. 20, pp.2–10.
Wyner, A.D. and Ziv, J. (1976) ‘The rate-distortion function for source coding with side information at the decoder’, IEEE Transactions on Information Theory, January, Vol. IT-22, pp.1–10.
Xiong, Z., Liveris, A. and Cheng, S. (2004) ‘Distributed source coding for sensor networks’, IEEE Signal Processing Magazine, September, Vol. 21, pp.80–94.
Ye, W., Heidemann, J. and Estrin, D. (2004) ‘Medium access control with coordinated adaptive sleeping for wireless sensor networks’, IEEE/ACM Transactions on Networks, June, Vol. 12, No. 3, pp.493–506.