Distributed Data Aggregation with Geographical Routing in Wireless Sensor Networks

Dorottya Vass, Attila Vidács
Dept. of Telecommunications and Media Informatics
Budapest University of Technology and Economics
H-1117 Budapest, Magyar tudósok krt. 2., Hungary
Email: {vass, vidacs}@tmit.bme.hu

Abstract— In wireless sensor networking applications, gathering sensed information and relaying it to the sink node using multi-hop communication in an energy efficient manner is of paramount importance. In this paper we present the idea of using Aggregator Nodes in order to decrease the amount of packets sent, hence reducing the energy required for communication. We introduce DDAP, a self-organizing Distributed Data Aggregation Protocol that uses randomly chosen Aggregator Nodes (ANs). We propose an extension of GOAFR, a robust geographical routing algorithm, collaborating with DDAP called Geographical Routing with Aggregation Nodes (GRAN). Simulation results evaluating the performance of DDAP and GRAN are presented. We show that by using DDAP and GRAN significant reduction in data traffic can be achieved, resulting in power saving and thus network lifetime prolongation.

I. INTRODUCTION

Wireless sensor networks constitute an emerging technology that has received significant attention from the research community. Sensor networks are typically self-organizing ad-hoc systems that consist of many small, low-cost devices. They monitor the physical environment, and subsequently gather and relay information to one or more sink nodes. The radio transmission range of the sensor nodes is typically orders of magnitude smaller than the geographical extent of the entire network. Thus, data needs to be relayed towards the sink node hop-by-hop in a multi-hop manner. The energy consumption of the network can be minimized if the amount of data that needs to be transmitted is also minimized.

Since activated sensor nodes often detect some common phenomena, there is likely to be redundancy in the data the activated sources report to a sink node. The idea behind data aggregation is to combine the data coming from different sensor nodes en route, eliminating redundancy, minimizing the number of transmissions and thus saving energy [1]. Thus, routing and data aggregation are strongly interconnected issues. The typical task is to find routes from multiple sources to a single destination that allow efficient in-network filtering of redundant data.

Several papers consider data aggregation [2], [3], [4], [5]. The authors of [6] propose a polynomial-time algorithm to solve the maximum lifetime data gathering problem. They assume that the locations of the sensors and the base station are fixed and known a priori. Furthermore, each sensor generates data periodically (i.e., time-driven), and sensor nodes have the ability to transmit packets to any other sensor in the network or directly to the base station. The authors of [7] propose a very simple randomized algorithm for routing information on a grid of sensors. They show that a very simple opportunistic aggregation scheme can result in near-optimum performance.

Efficient aggregation trees can be constructed using queries. A query-driven approach called Cougar is proposed in [8], where a loosely-coupled distributed architecture is presented to support both aggregation and more complicated in-network computation through declarative queries.

Although data aggregation results in fewer transmissions, it potentially results in greater delays [1]. Therefore, in addition to transmission cost, the fusion cost can (and must) significantly affect routing decisions when data aggregation is involved. The authors of [9] proposed an adaptive routing scheme called Adaptive Fusion Steiner Tree (AFST). The construction of the tree needs information about all the nodes that are participating in the transmission.

Our proposed solution differs from the above solutions in the following ways. We propose a self-organizing Distributed Data Aggregation Protocol called DDAP, and a related geographical routing scheme that uses DDAP. Our solution is simple and distributed: there is no central authority, and the nodes decide locally to whom the packet should be handed over. As opposed to [7], our sensor network consists of randomly distributed sensors. In contrast to [8], the proposed algorithm is for the event-driven case, where "relevant" nodes cannot be identified in advance. Unlike [6], we only have local information and strictly multi-hop communication with short radio distances. The algorithm proposed in [9] also needs a central authority. In our solution it is not necessary to know the whole routing tree. Sensors know only their neighbors, and the routing is based on local information.

The rest of the paper is organized as follows. Section II describes the assumed network model. In Section III we present an efficient Distributed Data Aggregation Protocol (DDAP) that reduces the number of messages sent in the network. We also present an extension to the GOAFR routing algorithm, called Geographical Routing with Aggregator Nodes (GRAN), to incorporate the proposed data aggregation scheme. Section IV presents the performance analysis of DDAP using simulations, while Section V concludes the paper.

Fig. 1. Paths between the active sensors and the sink without aggregation.

Fig. 2. Paths between the active sensors and the sink with 5% probability of being aggregator node.

II. NETWORK SCENARIO

We assume an event-driven scenario. Whenever the sensor nodes detect something that is worth reporting, their task is to transmit a message to the sink. Since each node is only able to communicate with neighbors within its radio range r_f, the message must be routed towards the sink hop-by-hop in a multi-hop manner. Typically, several sensors report on the same event at a time; their messages are sent in parallel to the sink, making the overall traffic load on relaying nodes significant.

A. Geographical routing

We considered a distributed routing solution, where there is no central authority to select the end-to-end path and inform the participating nodes about it. It is up to the nodes to decide locally to whom the packet should be handed over. The question is how to choose the next hop among the neighbors within radio range.

We applied the GOAFR routing algorithm [10], which is a variant of the original GFG algorithm [11]. GOAFR combines the greedy and the face algorithms. The greedy algorithm always picks the neighbor closest to the sink as the next node for routing. However, in certain situations it can occur that no neighbor is closer to the sink than the current node. In this case the routing switches to the face algorithm, and passes around the hole along its border. When it is possible, the routing switches back to the greedy algorithm.

B. Data aggregation

We assume that sensor data can be aggregated en route, that aggregation can potentially take place at any intermediate node along the path, and that multiple input packets can be aggregated into a single output packet. We define the set G of aggregator nodes (ANs). All ANs are ordinary sensor nodes that are in a state of actively aggregating incoming messages (see Section III for more details).

III. DISTRIBUTED DATA AGGREGATION PROTOCOL (DDAP)

Our task is to find an efficient data aggregation method that reduces the number of messages needed to report an event to the sink node. In theory, the minimum number of transmissions required is equal to the number of edges in the minimum Steiner tree in the network that contains all alerted sensors and the sink node. However, finding an optimal aggregation tree in the network requires global information on the nodes and the available communication links among them.
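To make the edge-counting argument concrete, the following minimal Python sketch counts transmissions for a given set of source-to-sink paths, once with every packet travelling on its own and once with packets merged wherever paths meet. It does not compute a minimum Steiner tree; it only counts the links of the tree formed by the supplied paths. The function names and the toy paths are hypothetical, and the sketch assumes ideal timing (every node forwards a single merged packet) and that paths do not diverge again after they have met.

```python
def transmissions_without_aggregation(paths):
    """Every packet travels its whole path alone: one transmission per hop."""
    return sum(len(path) - 1 for path in paths)


def transmissions_with_aggregation(paths):
    """Packets that meet at a node continue as one packet, so each
    distinct directed link is used exactly once (idealized assumption)."""
    links = set()
    for path in paths:
        # zip(path, path[1:]) yields the consecutive (sender, receiver) pairs
        links.update(zip(path, path[1:]))
    return len(links)


# Toy example (hypothetical node names): three alerted sensors a, b, c.
paths = [
    ["a", "m", "n", "sink"],
    ["b", "m", "n", "sink"],
    ["c", "n", "sink"],
]

plain = transmissions_without_aggregation(paths)
merged = transmissions_with_aggregation(paths)
print(plain, merged)
```

In this toy case the shared links m to n and n to sink are paid for only once when packets are merged, which is why the number of tree edges is the relevant cost.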

One simple approximate solution is to use the shortest path tree (SPT) for data aggregation and transmission. In this data aggregation scheme, each source sends its data to the sink along the shortest path between the two, and overlapping paths are combined to form the aggregation tree [1].

We propose DDAP, a self-organizing Distributed Data Aggregation Protocol that uses randomization to distribute the data aggregator roles among the sensor nodes in the network. Sensors elect themselves to be local aggregator nodes (ANs) at any given time with a given probability. These ANs then broadcast their status only to their neighbors. Each sensor then records which of its neighbors act as ANs. The optimal number of local aggregator nodes in the system needs to be determined a priori; it mostly depends on the network topology.

A. Geographical routing with DDAP

The geographical routing needs to be modified as well to incorporate DDAP. We propose an extension called GRAN, Geographical Routing with Aggregator Nodes, as follows. When looking for the next hop, instead of trying to find the neighbor that is closest to the sink, the task now is to find the neighboring aggregator node that is closest to the sink. If there is no neighboring AN at all, the routing continues as in the original GOAFR algorithm. If there are one or more ANs within radio range but all of them are farther away from the sink than the current node, the routing falls back to the original GOAFR as well. If there is at least one AN that is closer to the sink than the current node, the message is sent to that AN even if there are other neighboring nodes closer to the sink. This routing can result in a longer path than the optimal one, but the efficiency of opportunistic data aggregation is increased significantly by preferring ANs as relay nodes.

IV. SIMULATION RESULTS

We simulated the proposed data aggregation protocol (DDAP) using GRAN in MATLAB. Both the sensing range and the maximum communication range of each sensor were fixed to 80 m. We considered a dense network where sensors were distributed uniformly at random. The node density was set so that the average number of neighbors of each node was 64. A single sink node was placed in the center to collect the data. The events are reported to the sink through multi-hop communication.
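The building blocks of such a simulation can be sketched in a few lines. The Python sketch below is not the authors' MATLAB code; it only illustrates the setup values above, the DDAP self-election step, and the GRAN next-hop rule of Section III-A. All function names are hypothetical. The density formula assumes a uniform deployment and ignores border effects, and the face-routing fallback of GOAFR is not modelled (the function returns None where GOAFR would switch to face routing).

```python
import math
import random


def node_density(avg_neighbors, radio_range):
    """Nodes per square metre such that a disc of radius `radio_range`
    holds `avg_neighbors` nodes on average (border effects ignored)."""
    return avg_neighbors / (math.pi * radio_range ** 2)


def elect_aggregators(node_ids, p, rng):
    """DDAP self-election: each node becomes an AN independently with probability p."""
    return {n for n in node_ids if rng.random() < p}


def distance(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])


def gran_next_hop(current_pos, neighbors, aggregators, sink_pos):
    """Choose the next hop among `neighbors` (dict: node id -> (x, y)).

    Prefers the AN closest to the sink among ANs that are closer to the
    sink than the current node; otherwise falls back to the plain greedy
    choice. Returns None if no neighbor makes progress towards the sink.
    """
    d_current = distance(current_pos, sink_pos)
    closer = {n: pos for n, pos in neighbors.items()
              if distance(pos, sink_pos) < d_current}
    closer_ans = {n: pos for n, pos in closer.items() if n in aggregators}
    candidates = closer_ans if closer_ans else closer
    if not candidates:
        return None  # GOAFR would switch to face routing here
    return min(candidates, key=lambda n: distance(candidates[n], sink_pos))


# Setup values taken from the text: 64 neighbors on average, 80 m range.
density = node_density(64, 80.0)
# Hypothetical run: 1000 node ids, p = 0.08, fixed seed for repeatability.
ans = elect_aggregators(range(1000), 0.08, random.Random(1))
```

With these setup values the density comes out at roughly 0.0032 nodes per square metre, i.e. about one node per 314 square metres. Note that with p = 0 or p = 1 the rule degenerates to the plain greedy choice, since either no neighbor or every neighbor is an AN.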

Fig. 3. Number of packet transmissions as a function of probability p of being an AN.

Fig. 4. Average path length as a function of probability p of being an AN.

A. Parameter p in DDAP

Fig. 1 shows a snapshot of the network with one event being reported. Around the event we marked the circular area containing the sensors that observed that event. All those sensors start to send data to the sink (t) on multi-hop wireless paths that might overlap towards the sink node. Paths are chosen using the GOAFR algorithm, i.e., choosing the shortest path between the alerted sensor and the sink in this case. Even if the shortest paths are chosen independently for all source-sink pairs, the resulting data dissemination tree gives a possibility for in-network data aggregation. As packets approach the sink, paths intersect with a certain probability. An opportunistic data aggregation scheme can be performed if we assume that nodes at path intersection points perform aggregation.

Fig. 2 shows a snapshot of the network using DDAP and GRAN, where sensors elect themselves to be ANs independently with probability p = 0.05. When compared to Fig. 1, the difference between the data dissemination trees is clearly visible. Since GRAN prefers to choose ANs as next hops, paths intersect with higher probability around the sources.

We compared the effectiveness of in-network data aggregation with and without DDAP and GRAN. The distance between the sink node and the event was 900 meters, and all alerted sensors sent a single packet to the sink. We performed 2000 simulation trials to calculate the average number of packets in a round for the following three strategies: (1) there is no aggregation in the network; (2) there is opportunistic data aggregation in the network, but without using aggregator nodes; (3) there are aggregator nodes in the network using DDAP and GRAN. The probability p of being an aggregator node was 0.08.

Without aggregation, more than 800 packet transmissions (i.e., the total number of hops in all source-sink paths) are needed to report a single event. In contrast, fewer than 200 packet transmissions are needed if opportunistic data aggregation is used within the network, resulting in nearly 80% traffic reduction. A further 40% decrease in traffic load relative to the opportunistic aggregation can be achieved if ANs are used.
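The two reductions compound rather than add up. The short Python sketch below works through the arithmetic using the bounds quoted above as illustrative inputs (800 and 200 are the stated bounds, not measured averages); the function name is hypothetical.

```python
def combined_reduction(first, second):
    """Overall fraction of traffic removed when a second saving is applied
    to what is left after the first one (both given as fractions)."""
    return 1.0 - (1.0 - first) * (1.0 - second)


no_aggregation = 800   # illustrative: the text reports "more than 800"
opportunistic = 200    # illustrative: the text reports fewer than 200

# Saving of opportunistic aggregation relative to no aggregation.
first_stage = 1.0 - opportunistic / no_aggregation
# Additional 40% saving from ANs, relative to opportunistic aggregation.
overall = combined_reduction(first_stage, 0.40)

print(f"opportunistic aggregation: {first_stage:.0%} fewer transmissions")
print(f"DDAP + GRAN overall:       {overall:.0%} fewer transmissions")
```

Because the true counts lie beyond these bounds, the first-stage saving is somewhat above 75%, in line with the reported "nearly 80%", and the overall figure is consistent with the 85% quoted in the conclusion.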

An important question is to find the optimal number of aggregator nodes (ANs) in the network. According to the proposed DDAP protocol, each network node declares itself to be an AN with probability p, independently of the others. At the two extremes, where p is set to zero (i.e., none of the nodes declares itself an AN) or p is set to one (i.e., all nodes act as ANs), the result is as if DDAP were not implemented at all: all packets are routed towards the sink on the shortest path, and opportunistic aggregation happens by chance. If the probability p of being an AN is too small, the selected paths are close to the shortest paths, but the chance that the paths intersect is low. Interestingly, the same is true if p is too large. Even though GRAN prefers aggregator nodes, it can easily happen that there are too many ANs to effectively join the shortest paths running near-parallel to each other.

Fig. 3 shows the total number of packets transmitted in a round reporting one event as a function of probability p. (We ran 104 simulations; the error bars show the standard deviation of the averages.) As one can see, even a small number of ANs decreases the number of packet transmissions significantly. The optimal value of p is around 0.08, that is, approximately 8% of all network nodes are ANs. If p is higher, the number of packet transmissions increases slowly and nearly linearly.

B. Average path length

Fig. 4 shows the average path length between the activated sensor node and the sink when using DDAP and GRAN. The length of the paths was examined as a function of probability p. Unsurprisingly, using aggregator nodes increases the path length, because GRAN favors ANs as next hops instead of the neighbor closest to the sink, which would result in the shortest path. As can be seen from the figure, the average path length increases rapidly as soon as p exceeds zero. In the worst case the average path length is about 30% longer than without using GRAN. Having longer paths but a higher aggregation probability, which results in fewer packet transmissions, is clearly a tradeoff.

Fig. 5. Delays at ANs.

C. Aggregation delay

An important question is how the use of ANs affects the transmission delays in the network. ANs have to collect the packets for aggregation, and these packets may have traversed other nodes, possibly earlier ANs, on the path. We assume that every packet carries a counter C indicating how many ANs the packet has passed through. ANs calculate the waiting time (T_wait) according to the previous round. If a particular AN has not received any packets that traversed other ANs, i.e., all received counters C were zero, it only waits for a constant time ∆ after it receives the first packet in the round. If it receives more packets within ∆, it aggregates all packets into a single one. When its timer fires after time ∆, it sends its aggregate to the next hop. On the other hand, if an AN received packets that previously traversed one or more ANs, it calculates its waiting time as follows. It notes the minimum and the maximum values of C from the last round, and calculates the waiting period after the first arrived packet as

$$T_{\mathrm{wait}} = \big((C_{\max} - C_{\min}) + 1\big)\cdot\Delta. \tag{1}$$
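As a minimal sketch, the waiting-time rule of Eq. (1) can be written as a small Python function. The function name and the list representation of the counters are hypothetical; returning ∆ when no counters from a previous round are available is an assumption, since the text does not cover that case.

```python
def waiting_time(counters_last_round, delta):
    """Waiting period of an AN after the first packet of a round arrives.

    counters_last_round: AN counters C of the packets this AN received
                         in the previous round.
    delta:               the constant base waiting time.
    """
    if not counters_last_round:
        # Assumption: with no history, wait the base time only.
        return delta
    c_min = min(counters_last_round)
    c_max = max(counters_last_round)
    # Eq. (1); if all counters are equal (e.g., all zero) this gives delta.
    return ((c_max - c_min) + 1) * delta


# Packets that passed 0, 1 and 2 earlier ANs: wait three base periods.
print(waiting_time([0, 1, 2], 0.5))
```

The special case in the text, where all counters are zero, needs no separate branch because Eq. (1) then reduces to a wait of exactly ∆.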

Fig. 5 shows the development of T_wait in an illustrative example. The price we pay for using DDAP and GRAN is the increased delay. The delay of the network is proportional to the number of ANs on the transmission path. Fig. 6 shows the average number of ANs on the path as a function of probability p of being an AN.

Fig. 6. Average number of ANs on the path as a function of probability p of being an AN.

V. CONCLUSION

In this paper we presented the idea of using Aggregator Nodes in multi-hop wireless sensor networks to facilitate in-network data aggregation in order to decrease the number of packets to be relayed towards the sink. We introduced DDAP, a self-organizing Distributed Data Aggregation Protocol that uses randomly chosen Aggregator Nodes (ANs). ANs aggregate all incoming data packets en route. Every single node can volunteer to be an AN with probability p, independently of the others. After introducing DDAP, we proposed an extension of GOAFR, a robust and distributed geographical routing algorithm, that collaborates with DDAP, called Geographical Routing with Aggregation Nodes (GRAN). Simulation results evaluating the performance of DDAP and GRAN have shown that our proposed solutions reduce the amount of data traffic within the network by as much as 85% compared with the case of no aggregation in the network at all. Compared to opportunistic aggregation, DDAP and GRAN reduce the traffic by 40%, which is still considerable. The increased network delay using DDAP was also investigated. As for future work, an interesting question would be to examine an adaptive DDAP, where the probability p of being an Aggregator Node can be set adaptively.

REFERENCES

[1] B. Krishnamachari, D. Estrin, and S. B. Wicker, “The impact of data aggregation in wireless sensor networks,” in Proc., 22nd Int. Conf. on Distributed Computing Systems, 2002.
[2] B. Krishnamachari, D. Estrin, and S. Wicker, “Modelling data-centric routing in wireless sensor networks,” in Proc., IEEE INFOCOM, 2002.
[3] C. Intanagonwiwat, R. Govindan, D. Estrin, J. Heidemann, and F. Silva, “Directed diffusion for wireless sensor networking,” IEEE/ACM Transactions on Networking, vol. 11, no. 1, pp. 2–16, February 2003. [Online]. Available: citeseer.ist.psu.edu/intanagonwiwat03directed.html
[4] N. Shrivastava, C. Buragohain, D. Agrawal, and S. Suri, “Medians and beyond: New aggregation techniques for sensor networks,” in Proc., 2nd Int. Conf. on Embedded Networked Sensor Systems (SenSys), Baltimore, MD, USA, 2004.
[5] H. Çam, S. Özdemir, P. Nair, D. Muthuavinashiappan, and H. O. Sanli, “Energy-efficient secure pattern based data aggregation for wireless sensor networks,” Computer Communications (Elsevier), vol. 29, pp. 446–455, 2006.
[6] K. Kalpakis, K. Dasgupta, and P. Namjoshi, “Maximum lifetime data gathering and aggregation in wireless sensor networks,” in Proc., IEEE Networks Conference, 2002.
[7] M. Enachescu, A. Goel, R. Govindan, and R. Motwani, “Scale-free aggregation in sensor networks,” Theoretical Computer Science, vol. 334, pp. 15–29, 2005.
[8] Y. Yao and J. Gehrke, “The Cougar approach to in-network query processing in sensor networks,” ACM SIGMOD, vol. 31, no. 3, pp. 9–18, 2002.
[9] H. Luo, J. Luo, and Y. Liu, “Energy efficient routing with adaptive data fusion in sensor networks,” in Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications, Cologne, Germany, 2005.
[10] F. Kuhn, R. Wattenhofer, and S. A. Zollinger, “Worst-case optimal and average-case efficient geometric ad-hoc routing,” in Proc., 4th ACM Int. Symp. on Mobile Ad Hoc Networking and Computing, Annapolis, Maryland, USA, 2003, pp. 267–278.
[11] P. Bose, P. Morin, I. Stojmenovic, and J. Urrutia, “Routing with guaranteed delivery in ad hoc wireless networks,” Wireless Networks, vol. 7, no. 6, pp. 609–616, 2001.
