Globecom 2012 - Wireless Networking Symposium
Fault-tolerant Scheduling for Data Collection in Wireless Sensor Networks
Liang Zhang1, Qiang Ye2, Jie Cheng2, Hongbo Jiang1, Yake Wang1, Rui Zhou2, Peng Zhao3
1 Dept. of EIE, Huazhong University of Science and Technology, Wuhan, China 430074
2 Dept. of CS and IT, University of Prince Edward Island, Charlottetown, PE, Canada C1A 4P3
3 System Software Research Laboratory, Futurewei Technologies, Santa Clara, CA, USA 95050
Email: {zhangliangshuxue}@gmail.com, [email protected], {jiecheng2009, hongbojiang2004, kay1118}@gmail.com, [email protected], peng [email protected]
Abstract—Wireless sensor networks are expected to be used in many different applications such as disaster relief, environmental control, and intelligent buildings. In this paper, we focus on a sensor network that collects environment data from all sensor nodes periodically. To gather the sensing data quickly and reliably, the scheduling algorithm should be able to coordinate the data transmissions in the network and react to node/link failures effectively. In this paper, we present an innovative scheduling algorithm, Fault-Tolerant Scheduling for data collection (FTS), that leads to short data collection time and high fault tolerance. Our experimental results show that FTS outperforms the DCSB algorithm and exhibits strong fault-tolerant capabilities. Index Terms—Fault Tolerance, Scheduling, Data Collection, Wireless Sensor Networks.
I. INTRODUCTION

Wireless sensor networks are expected to be used in many different applications such as disaster relief, environmental control, and intelligent buildings [1]. In this paper, we consider a wireless sensor network with a large number of sensor nodes arbitrarily deployed in a finite geographical region. Each sensor node detects the environment and generates some sensing data at regular intervals. The union of all sensing data at a specific timepoint enables a good understanding of the environment at the sampling moment. The task of data collection is to deliver all sensing data to the sink node efficiently and reliably. Namely, the data from all sensor nodes at the sampling moment should be forwarded to the sink as soon as possible. In addition, if some node or link fails, the network should be able to survive the failure effectively. To achieve these goals, a scheduling algorithm should be used to coordinate the data transmissions in the network.

Scheduling in sensor networks has been studied over the past years [2]-[10]. Wang et al. showed that the scheduling problem for data collection is NP-hard [2]. Chen et al. discussed the capacity of data collection in arbitrary wireless sensor networks [9]. Choi et al. proposed a fault-tolerant scheduling algorithm in [10]. Zhao et al. developed an innovative scheduling mechanism for continuous data collection with dynamic traffic patterns [3]. However, existing algorithms have not solved the collection time and reliability problems effectively. Most of them still lead to unsatisfactory data collection time. The reliability problem tends to be ignored although node or link failures occur frequently in wireless sensor networks.

In this paper, we present an innovative scheduling algorithm, Fault-Tolerant Scheduling for data collection (FTS), to solve the collection time and reliability problems simultaneously. Normally, the FTS algorithm can generate an efficient schedule that leads to a close-to-lower-bound collection time. When a network malfunction (such as a node or link failure) takes place, the FTS algorithm can be used to survive the malfunction by switching the parent of a sensor node to its backup parent.

The rest of this paper is organized as follows. Section II presents our system model. The proposed FTS algorithm and its capacity are discussed in Section III. Section IV includes our simulation results and Section V concludes the paper.

II. SYSTEM MODEL

In this paper, we consider a wireless sensor network with n sensor nodes v1, ..., vn and a sink v0. The antenna of each sensor node is omni-directional and the network is static. We assume that a common wireless channel is shared by all sensor nodes in the network and the transmission rate is fixed at W bits/sec. Furthermore, the size of all data packets in the network is consistently S bits. In our research, time is divided into discrete time slots, each of which is S/W seconds. Hence, a packet can be transmitted from one sensor node to its neighbor within one time slot. Based on these assumptions, a TDMA-based scheduling algorithm is proposed in this paper.

We adopt the disk graph model and protocol interference model in this paper. Specifically, we use G(V, E) to denote the topology of the network, where V is the set of all sensor nodes (including the sink) and E is the set of all transmission links. Under the disk graph model, there is an edge between node vi and vj (i.e., vi can send packets to vj) if and only if ||vi − vj|| < r, where ||vi − vj|| is the Euclidean distance between vi and vj, and r is the transmission range of the sensor nodes. For simplicity, we assume that all sensor nodes have the same transmission range. Under the protocol interference model, vi can send a packet to vj if and only if no other node within the interference range R of vj transmits data during the same time slot. Otherwise, a collision will take place and the packet will be corrupted. For simplicity, we also assume that all nodes have a uniform interference range. Generally, the interference range is greater than or equal to the transmission range.

For data collection purposes, a classical Breadth First Tree (BFT) with the sink being the root is constructed. This tree is also called the data collection tree, denoted as T. For node vi, we use p(vi) to denote vi's parent. The set of vi's child nodes is represented as Ch(vi). In this paper, a node within the transmission range of vi is called vi's neighbor. The set of vi's neighbors is denoted as N(vi). In addition, we use Int(vi) to represent the set of nodes that are within the interference range of vi. In our research, the following definitions are also adopted.
Definition 1. Backup Parent Set: The backup parent set of node vi, BP(vi), contains the nodes that can be used to find an alternative path in order to improve the fault-tolerant performance. It is composed of three subsets: BP1(vi), BP2(vi), and BP3(vi).

The nodes in BP1(vi) are vi's neighbors. They share the same parent, which is actually vi's grandparent, p(p(vi)). Note that p(vi) does not belong to BP1(vi). Formally,

$$BP1(v_i) = \{ v_k : v_k \in N(v_i) \text{ and } p(v_k) = p(p(v_i)) \} \setminus \{ p(v_i) \}$$

where vk represents a sensor node.

The nodes in BP2(vi) are vi's neighbors. They are also the neighbors of p(p(vi)). However, they do not belong to BP1(vi), and p(vi) is not part of BP2(vi). Formally,

$$BP2(v_i) = \big( N(v_i) \cap N(p(p(v_i))) \big) \setminus \big( BP1(v_i) \cup \{ p(v_i) \} \big)$$

The nodes in BP3(vi) are vi's siblings that have at least one BP1-type or BP2-type backup parent. Formally,

$$BP3(v_i) = \{ v_k : v_k \in N(v_i) \cap Ch(p(v_i)) \text{ and } BP1(v_k) \cup BP2(v_k) \neq \emptyset \}$$

Hence, BP(vi) can be defined as:

$$BP(v_i) = BP1(v_i) \cup BP2(v_i) \cup BP3(v_i) \qquad (1)$$

Definition 2. General Competitor Set: The general competitor set of node vi, GCS(vi), includes the nodes that cannot transmit data during the same time slot as vi. It consists of two subsets: GCS1(vi) and GCS2(vi).

GCS1(vi) contains the nodes that could potentially destroy the signal received by vi's parent or backup parents. Namely, all the nodes that are within the interference range of vi's parent or backup parents are included in GCS1(vi). Formally,

$$GCS1(v_i) = Int(p(v_i)) \cup \Big( \bigcup_{v_k \in BP(v_i)} Int(v_k) \Big)$$

GCS2(vi) includes the nodes whose parent or backup parents would receive corrupted signals due to the data transmission from vi. Formally,

$$GCS2(v_i) = \{ v_k : p(v_k) \in Int(v_i) \text{ or } bp(v_k) \in Int(v_i) \}$$

where bp(vk) ∈ BP(vk) is a backup parent of vk. Thus, GCS(vi) can be defined as:

$$GCS(v_i) = GCS1(v_i) \cup GCS2(v_i) \qquad (2)$$

Definition 3. Maximum Non-Interference Set: In a sensor network, the maximum non-interference set (MNIS) contains the maximum number of nodes that can transmit data simultaneously without collision. Basically, the nodes in the MNIS are far enough from each other so that no simultaneous transmissions will collide. Because it is the maximum set, adding one more node to it will lead to some collision.

Definition 4. Potential Slot: For node vi, a potential slot is a time slot that is not currently used, but could potentially be used to send a packet without introducing a collision in the network.

Definition 5. Snapshot: In the sensor network under investigation, all sensor nodes detect the environment and send data packets to the sink periodically. A snapshot is the union of the data from all sensor nodes at a specific sampling timepoint. One of the goals of the proposed algorithm is to reduce the time required to acquire a snapshot.

Definition 6. Data Collection Delay: The data collection delay is defined as the number of time slots required by the sink to acquire a snapshot. The shorter the delay, the better the performance.
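To make the system model and Definitions 1 and 2 more concrete, the following Python sketch builds neighbor and interference sets under the disk graph model, constructs a breadth-first data collection tree, and evaluates BP(vi) and GCS(vi). It is an illustration only and is not taken from the paper: the six-node grid, the function names, the tie-breaking by node ID, and the choice to let Int(v) contain v itself (so that a node is never asked to transmit while it is receiving) are all assumptions made for this example.

```python
from collections import deque
from math import dist


def disk_sets(pos, radius, include_self=False):
    """Nodes strictly closer than `radius` to each node (disk graph model)."""
    return {u: {v for v in pos
                if (include_self or v != u) and dist(pos[u], pos[v]) < radius}
            for u in pos}


def bfs_parents(nbrs, sink):
    """Breadth-first tree rooted at the sink; returns the parent map p(.)."""
    parent, queue = {sink: None}, deque([sink])
    while queue:
        u = queue.popleft()
        for v in sorted(nbrs[u]):          # assumed tie-break: smallest ID first
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def backup_parents(nbrs, parent):
    """BP(v) = BP1(v) | BP2(v) | BP3(v) for every node in the tree (Definition 1)."""
    bp1, bp2 = {}, {}
    for u, p in parent.items():
        g = parent.get(p) if p is not None else None   # grandparent p(p(u))
        if g is None:                                  # sink or child of the sink
            bp1[u], bp2[u] = set(), set()
            continue
        bp1[u] = {k for k in nbrs[u] if parent.get(k) == g} - {p}
        bp2[u] = (nbrs[u] & nbrs[g]) - (bp1[u] | {p})
    bp = {}
    for u, p in parent.items():
        siblings = {k for k in nbrs[u]
                    if p is not None and k in parent and parent[k] == p}
        bp3 = {k for k in siblings if bp1[k] | bp2[k]}
        bp[u] = bp1[u] | bp2[u] | bp3
    return bp


def gcs(u, parent, bp, intf):
    """General competitor set GCS(u) = GCS1(u) | GCS2(u) (Definition 2)."""
    receivers = ({parent[u]} if parent[u] is not None else set()) | bp[u]
    gcs1 = set().union(*(intf[r] for r in receivers))
    gcs2 = {k for k in parent
            if k != u and parent[k] is not None
            and ({parent[k]} | bp[k]) & intf[u]}
    return (gcs1 | gcs2) - {u}


# Hypothetical 2 x 3 grid; node 0 is the sink and r = R = 1.2 grid units.
pos = {0: (0, 0), 1: (1, 0), 2: (0, 1), 3: (1, 1), 4: (2, 0), 5: (2, 1)}
nbrs = disk_sets(pos, radius=1.2)
intf = disk_sets(pos, radius=1.2, include_self=True)
parent = bfs_parents(nbrs, sink=0)
bp = backup_parents(nbrs, parent)

print("parents:", parent)
print("BP(3):", bp[3], "BP(5):", bp[5])
print("GCS(5):", sorted(gcs(5, parent, bp, intf)))
```

In this toy network, node 3 can fall back on node 2 and node 5 on node 4, both as BP1-type backup parents. Note that the competitor set of node 5 is large precisely because the interference ranges of its backup parent are protected as well as those of its current parent.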
III. FTS ALGORITHM

The proposed scheduling algorithm, Fault-Tolerant Scheduling for data collection (FTS), is composed of two components: Pre-Scheduling and Adaptive Scheduling. When a sensor network starts up, Pre-Scheduling is used to find a system-wide schedule for the network. Without network malfunctions, this schedule enables the sink to acquire a snapshot efficiently. When some node or link fails, Adaptive Scheduling is used to adjust the schedule in order to improve the fault-tolerance performance. Pre-Scheduling is a centralized approach that is based on system-wide information. In contrast, Adaptive Scheduling is a distributed method that only depends on local knowledge to adjust the existing schedule slightly. The details of these two components are presented as follows.

A. Pre-Scheduling

In our research, we use three different colors, white, grey, and black, to indicate the state of sensor nodes in the network. White nodes still have some packets to transmit. Grey nodes do not have data to send, but their descendants have some packets. Black nodes are already dealt with during the current round. Namely, both these nodes and their descendants have nothing to transmit.

To acquire a snapshot, a number of time slots are required. Since there is only one common wireless channel in the network, the sink can receive at most one packet during each time slot. To make the data collection delay as short as possible, Pre-Scheduling attempts to establish a schedule with which a packet is sent to the sink during each time slot whenever possible. To achieve this goal, three sub-algorithms are used in Pre-Scheduling.

During each time slot, the node that can send its packet to the sink using the fewest time slots is called the "start point". The Finding Start Point algorithm is used to find the start point for each time slot. Of course, there could be multiple nodes that use the same number of slots to transmit their data to the sink.
Namely, there could be multiple start point candidates. In this case, only one of them is selected as the start point. The details are presented in Algorithm 1.

Algorithm 1 Finding Start Point
Input: network G, data collection tree T, and white color set WHITE.
Output: start point u.
Divide all nodes into layers L1, L2, ..., Ll.
for i ← 1 to l do
    for ∀u ∈ Li do
        if u ∈ WHITE then
            Return u
        end if
    end for
end for

Once the start point for a time slot is selected, it is scheduled to send a packet to the sink (or to its parent if it is not a neighbor of the sink) during that slot. Note that, during the time slot, other nodes should also forward their packets to their parents as long as they do not collide with the start point. Specifically, Pre-Scheduling uses the algorithm "Finding MNIS" to find the MNIS for the slot. When the MNIS for the slot is finalized, all the nodes in the MNIS can send their packets during the same slot without collision. The MNIS for the slot is also called the "transmission set" TS in this paper. The details are described in Algorithm 2.

Algorithm 2 Finding MNIS
Input: network G, data collection tree T, start point u, and white color set WHITE.
Output: transmission set TS.
TS ← ∅
Add u into TS
for i ← 1 to l do    (layers L1, ..., Ll as in Algorithm 1)
    for ∀vk ∈ Li do
        if vk ∈ WHITE and vk ∉ TS then
            if ∀vj ∈ TS, vk ∉ GCS(vj) then
                Add vk into TS
            end if
        end if
    end for
end for
Return TS

At the beginning of the sampling interval, each sensor node generates a data packet. The goal of Pre-Scheduling is to find an efficient schedule with which data packets can be forwarded to the sink using the fewest time slots. The Pre-Scheduling algorithm achieves the goal by choosing a proper transmission set for each time slot. Specifically, during each time slot, the algorithm first selects the start point, then finds the MNIS, and finally updates the data collection tree and proceeds to the next round. The details are presented in Algorithm 3. Note that we use x(vi) to denote the number of packets at vi and D(vi) to represent the set of vi's descendants.

Algorithm 3 Pre-Scheduling
Input: network G, data collection tree T.
Output: data collection schedule S.
BLACK ← ∅
GREY ← ∅
WHITE ← V
while BLACK ≠ V do
    u ← FindingStartPoint(G, T, WHITE)
    TS ← FindingMNIS(G, T, u, WHITE)
    Add time slot for TS to S
    for ∀vi ∈ TS do
        x(vi) ← x(vi) − 1
        x(p(vi)) ← x(p(vi)) + 1
        if x(vi) = 0 and ∀vj ∈ D(vi), vj ∈ BLACK then
            Add vi to BLACK
        end if
        if x(vi) = 0 and ∃vj ∈ D(vi), vj ∉ BLACK then
            Add vi to GREY
        end if
    end for
    Go to next time slot
end while

Fig. 1 includes a simple network that can be used to illustrate how Pre-Scheduling works. Note that the solid lines in the figure represent the edges in the data collection tree while the dotted lines correspond to potential edges (for clarity, only the potential edges connecting sensor nodes to their backup parents are included). Fig. 1(a) shows the initial data collection tree. Initially, all nodes are white because each of them has a data packet. Fig. 1(b)-(l) represent the state of the data collection tree during different time slots.

Fig. 1. Pre-Scheduling Example (panels (a)-(l))
The Pre-Scheduling algorithm first generates a transmission set for the first time slot. In this example, node 1 is selected as the start point for the first slot. Since GCS(v1) = {2, 3, 4, 5, 6, 7} (due to space limitations, the detailed interference relationship information is not included in this paper), the MNIS for the slot is {1, 8}. Namely, only nodes 1 and 8 are allowed to send their packets to their parents during the slot. In addition, the following updates are processed in order to find the transmission set for the second time slot. First of all, the number of packets at nodes 1 and 8 is reduced by 1. Thus, both x(1) and x(8) become 0. Since the descendants of node 1, nodes 4 and 5, still have packets, the color of node 1 should be changed from white to grey. However, because node 8 has no descendants, its color should be switched from white to black. Secondly, the number of packets at the parents of nodes 1 and 8 should be increased by 1. Thus, x(3) becomes 2. The state of the data collection tree at this moment is shown in Fig. 1(b).

For the following time slots, the Pre-Scheduling algorithm generates other transmission sets and updates the data collection tree correspondingly. The details of the process are not included in this paper due to space limitations. The state of the data collection tree for these slots is shown in Fig. 1(c)-(l). Overall, 11 time slots are required to acquire a snapshot in this example network.

TABLE I
SCHEDULE GENERATED BY PRE-SCHEDULING

Node      States in time slots 1-11
node 1    10210210000
node 2    01000002100
node 3    20101000021
node 4    00100000000
node 5    00000100000
node 6    00000001000
node 7    00000000010
node 8    10000000000
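As a rough illustration of how Algorithms 1-3 fit together, the Python sketch below runs a greedy slot-by-slot schedule on a hypothetical six-node tree (not the network of Fig. 1) and prints each node's activity as a digit string in the style of Table I, with 0 for sleep, 1 for transmit, and 2 for receive. Several simplifications are assumptions of this sketch rather than statements from the paper: a node is treated as white whenever it currently holds a packet, the color bookkeeping is replaced by packet counters, two nodes are considered competitors when a parent or backup parent of one lies in the interference set of the other, and Int(v) is taken to include v itself.

```python
def conflicts(u, v, recv, intf):
    """True if u and v may not share a slot (u in GCS(v) or v in GCS(u))."""
    return bool(recv[u] & intf[v]) or bool(recv[v] & intf[u])


def pre_schedule(parent, depth, bp, intf, sink):
    """Return the list of transmission sets, one per time slot."""
    senders = sorted(parent, key=lambda v: (depth[v], v))   # layer order
    recv = {v: {parent[v]} | bp.get(v, set()) for v in senders}
    x = {v: 1 for v in senders}     # packets currently held by each node
    x[sink] = 0                     # packets delivered so far
    slots = []
    while x[sink] < len(senders):
        ts = []
        # The first packet-holding node in layer order plays the role of the
        # start point; the others are added greedily if they do not conflict.
        for v in senders:
            if x[v] > 0 and not any(conflicts(v, u, recv, intf) for u in ts):
                ts.append(v)
        for v in ts:                # every node in TS forwards one packet
            x[v] -= 1
            x[parent[v]] += 1
        slots.append(ts)
    return slots


def digit_series(slots, parent, node):
    """Table I style encoding: 0 = sleep, 1 = transmit, 2 = receive."""
    return "".join(
        "1" if node in ts
        else "2" if any(parent[v] == node for v in ts)
        else "0"
        for ts in slots)


# Hypothetical network: node 0 is the sink.
adj = {0: {1, 2}, 1: {0, 3, 4}, 2: {0, 3}, 3: {1, 2, 5}, 4: {1, 5}, 5: {3, 4}}
intf = {v: adj[v] | {v} for v in adj}       # interference range = transmission range
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3}     # data collection tree
depth = {1: 1, 2: 1, 3: 2, 4: 2, 5: 3}
bp = {3: {2}, 5: {4}}                       # backup parents (assumed given)

slots = pre_schedule(parent, depth, bp, intf, sink=0)
print("slots used:", len(slots), "lower bound:", len(parent))
for node in sorted(adj):
    print("node", node, digit_series(slots, parent, node))
```

Because the backup parents are protected as well as the current parents, few nodes can share a slot in such a small network, so the sketch needs more slots than the lower bound. It is meant only to show the control flow, not to reproduce the delays reported in the paper.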
In the sensor network under investigation, a sensor node can be in one of the following states: sleep, transmit, and receive. We use 0, 1, and 2 to denote these three states respectively. With this notation, we can use a series of digits to indicate the activities of a sensor node during the period used to acquire a snapshot. Each digit represents the state of the sensor node during one time slot. For instance, the series of digits corresponding to node 8 during the first acquiring period is 10000000000. This means that node 8 sends out a packet during the first slot (indicated by "1") and remains silent during the remaining slots (denoted by the ten "0"s that follow). In this manner, we can use a set of digit series corresponding to all sensor nodes to denote the schedule generated by Pre-Scheduling. The final schedule for the example network is summarized in Table I.

B. Adaptive Scheduling

Adaptive Scheduling is a distributed algorithm that only requires local knowledge. The nodes suffering from node or link failure can run the algorithm locally in order to restore data communication. The basic idea of Adaptive Scheduling is that whenever the link between a sensor node and its parent stops working, the sensor node runs the algorithm to switch its parent to one of its backup parents. Since the parent of the node is changed, the schedule generated by Pre-Scheduling has to be modified accordingly. Adaptive Scheduling takes advantage of the potential slots in the network to make sure that other nodes can still use the original schedule to transmit their data after the schedule is adjusted. In addition, with the adjusted schedule, the number of time slots used to acquire a snapshot stays unchanged. Namely, the data collection delay does not increase even after the schedule has been modified using Adaptive Scheduling.

In Adaptive Scheduling, the Switching Parent algorithm is used to select a proper backup parent in order to resume the interrupted communication. The algorithm not only switches the parent of a sensor node, but also adjusts the schedule slightly. The details are described in Algorithm 4. Here, we assume that each node u maintains the following information:
• u's unique ID.
• u's backup parent sets: BP1(u), BP2(u), and BP3(u).
• The time slots used by u to send and receive packets (the packets originate from node u or its descendants), denoted as S(u).
• The time slots used by p(u) to send and receive packets, denoted as Sp(u).
• The potential slots of the nodes in GCS(u), denoted as PSL.

Algorithm 4 Switching Parent
Sort the nodes in BP1(u), BP2(u), and BP3(u) in increasing order of node degree.
if ∃vj ∈ BP1(u) and available PSL allows the switch then
    p(u) ← vj
    Add original Sp(u) to S(vj)
    BP1(u) ← BP1(u) \ {vj}
else if ∃vk ∈ BP2(u) where vk is a leaf node, and available PSL allows the switch then
    p(vk) ← p(p(u))
    p(u) ← vk
    Add original Sp(u) to S(vk)
    Add original Sp(vk) to S(p(vk))
    BP2(u) ← BP2(u) \ {vk}
else if ∃vl ∈ BP3(u), ∃vn ∈ (BP1(vl) ∪ BP2(vl)), and available PSL allows the switch then
    p(u) ← vl
    p(vl) ← vn
    Add original Sp(u) to S(vl)
    Add original Sp(vl) to S(vn)
    BP3(u) ← BP3(u) \ {vl}
end if
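The parent-selection order of Algorithm 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the paper does not spell out how the potential slot list is checked, so the test "available PSL allows the switch" is represented by a caller-supplied predicate slot_ok(child, new_parent), which is hypothetical, as are the function name, the tie-breaking by node ID, and the example data. The bookkeeping that moves Sp(u) into the new parent's slot set is omitted.

```python
def switch_parent(u, parent, bp1, bp2, bp3, degree, is_leaf, slot_ok):
    """Try BP1, then BP2, then BP3; return the list of (child, new_parent) changes."""
    def ranked(nodes):
        # Increasing node degree, ties broken by ID (an assumption).
        return sorted(nodes, key=lambda v: (degree[v], v))

    # Case 1: a BP1 node already forwards to u's grandparent.
    for vj in ranked(bp1.get(u, set())):
        if slot_ok(u, vj):
            parent[u] = vj
            bp1[u].discard(vj)
            return [(u, vj)]

    # Case 2: a leaf BP2 node is re-attached to u's grandparent first.
    grand = parent.get(parent[u])
    for vk in ranked(bp2.get(u, set())):
        if (grand is not None and is_leaf(vk)
                and slot_ok(u, vk) and slot_ok(vk, grand)):
            parent[vk] = grand
            parent[u] = vk
            bp2[u].discard(vk)
            return [(vk, grand), (u, vk)]

    # Case 3: a sibling vl that itself has a BP1/BP2 backup parent vn.
    for vl in ranked(bp3.get(u, set())):
        for vn in ranked(bp1.get(vl, set()) | bp2.get(vl, set())):
            if slot_ok(u, vl) and slot_ok(vl, vn):
                parent[u] = vl
                parent[vl] = vn
                bp3[u].discard(vl)
                return [(u, vl), (vl, vn)]

    return []   # no usable backup parent


# Hypothetical data: nodes 5 and 6 are children of node 3, which has failed.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3, 6: 3}
bp1, bp2, bp3 = {5: {4}}, {}, {6: {5}}
degree = {1: 3, 2: 2, 3: 4, 4: 2, 5: 3, 6: 2}


def is_leaf(v):
    return v not in parent.values()


def always_ok(child, new_parent):
    return True


print(switch_parent(6, dict(parent), bp1, bp2, bp3, degree, is_leaf, always_ok))
print(switch_parent(5, dict(parent), bp1, bp2, bp3, degree, is_leaf, always_ok))
```

In the first call node 6 has no BP1 or BP2 candidate, so it attaches to its sibling 5, which in turn moves to its own backup parent 4. In the second call node 5 switches directly to node 4.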
We assume that a sensor node needs to receive an ACK from its parent in order to confirm the receipt of the previous data packet. A "No ACK" event indicates that either the parent node fails or the link from the sensor node to its parent stops working, which should trigger the Switching Parent algorithm. The Adaptive Scheduling algorithm uses the "No ACK" event as the trigger to get itself started. The details of the algorithm are presented in Algorithm 5.

Algorithm 5 Adaptive Scheduling
1: Normally, each node sends its data to its parent during the time slots scheduled by the Pre-Scheduling algorithm.
2: When a node cannot receive the expected ACK from its parent, it triggers the Switching Parent algorithm and changes its parent to the selected backup parent during a time slot that is normally used to send data packets.
IV. SIMULATION

We carried out simulations to study the performance of the proposed FTS algorithm. In our simulations, sensor nodes are randomly distributed in a 100 m × 100 m area while the sink node is always located at the center. All sensor nodes have the same transmission/interference range. Our experimental results indicate that, in terms of data collection delay, the FTS algorithm outperforms the DCSB (Data Collection Scheduling on BFS) algorithm [9], a recent scheduling algorithm for data collection. In addition, the FTS algorithm exhibits very strong fault-tolerant capabilities.

A. Data Collection Delay

For a sensor network with N sensor nodes, the data collection delay is at least N time slots. This is because there is only one common wireless channel in the network and thus the sink node can receive at most one packet during each time slot. To acquire a snapshot, the sink node needs to receive one data packet from each sensor node. Hence, in the best case, the sink is kept busy receiving packets, resulting in a data collection delay of N time slots. N time slots is therefore the lower bound of the data collection delay.

Fig. 2 shows the performance of the FTS algorithm in terms of data collection delay. Here, the default number of sensor nodes, transmission range, and interference ratio are 250, 15 m, and 1 respectively. Fig. 2 indicates that although the data collection delays of both FTS and DCSB increase with the number of nodes in the network, the data collection delay of DCSB grows faster. In fact, the data collection delay of FTS is very close to the lower bound and is approximately 1/3 of the data collection delay of DCSB.

Fig. 2. FTS Performance: Data Collection Delay vs. Node Number (x-axis: Number of Nodes, 100 to 500; y-axis: Data Collection Delay; curves: DCSB, FTS, Lower bound)

B. Fault Tolerance

When we studied the fault-tolerant performance of the FTS algorithm, the node number, transmission range, and interference ratio were set to 400, 15 m, and 1 respectively. In our research, the fault tolerance of FTS was investigated by introducing node failures and analyzing how many nodes are affected. The node failure rate introduced to the network was chosen from the range 0%-10%. Fig. 3 shows the fault-tolerant performance of FTS in terms of affected node rate vs. node failure rate.
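For intuition about the affected node rate metric, the short Python sketch below computes it for the case in which nothing is repaired after a failure, so that every surviving node whose tree path to the sink passes through a failed node can no longer forward its data. The tree, the function name, and the decision to count only surviving nodes as affected are assumptions of this sketch; the paper does not state whether the failed nodes themselves are counted.

```python
def affected_rate_no_repair(parent, sink, failed):
    """Fraction of sensor nodes that survive but are cut off from the sink."""
    def reaches_sink(v):
        # Walk up the data collection tree until the sink or a failed node.
        while v != sink:
            if v in failed:
                return False
            v = parent[v]
        return True

    nodes = list(parent)            # all sensor nodes (the sink is excluded)
    cut_off = [v for v in nodes if v not in failed and not reaches_sink(v)]
    return len(cut_off) / len(nodes)


# Hypothetical data collection tree with sink 0.
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3}

print(affected_rate_no_repair(parent, 0, failed={3}))   # only node 5 is cut off
print(affected_rate_no_repair(parent, 0, failed={1}))   # nodes 3, 4, 5 are cut off
```

A failure close to the sink disconnects a whole subtree, which is why switching to a backup parent, or rebuilding the tree, can reduce the affected node rate substantially.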
Note that the affected node rate is the ratio of the number of nodes that can no longer forward their data to the total number of nodes in the network. We compared the performance of FTS to two extreme scenarios: the "best rate" and "worst rate" cases. In the best rate case, whenever a node fails, the transmission tree is reconstructed. This leads to the best fault-tolerant performance at a very expensive cost. In the worst rate case, when a node stops working, no measure is taken to alleviate the problem. This results in the worst fault-tolerant performance. Our experimental results indicate that as the node failure rate increases from 0% to 10%, the affected node rate of FTS grows slightly. Even so, the fault-tolerant performance of FTS is very close to the ideal performance, which is indicated by the "best rate" curve in Fig. 3.

Fig. 3. Fault-tolerant Performance of FTS (x-axis: Node Failure Rate; y-axis: Affected Node Rate; curves: Best rate, Rate of FTS, Worst rate)

V. CONCLUSIONS

Data collection delay and reliability are two important factors that should be taken into consideration when we design a scheduling algorithm for wireless sensor networks. Existing algorithms have not solved these two problems effectively. In this paper, we propose an innovative scheduling algorithm, FTS, that leads to short data collection delay and high fault tolerance. The performance of FTS is investigated through simulations. Our experimental results show that FTS outperforms the DCSB algorithm and exhibits strong fault-tolerant capabilities.

REFERENCES

[1] I. F. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey on sensor networks. IEEE Communications Magazine, 40:102–114, August 2002.
[2] Ju Wang, Hongsik Choi, and Esther A. Hughes. Scheduling on sensor hybrid network. In Proceedings of IEEE International Conference on Computer Communications and Networks (ICCCN), 2005.
[3] Wenbo Zhao and Xueyan Tang. Scheduling data collection with dynamic traffic patterns in wireless sensor networks. In IEEE INFOCOM, 2011.
[4] Shashidhar Gandham, Ying Zhang, and Qingfeng Huang. Distributed minimal time convergecast scheduling in wireless sensor networks. In Proceedings of the 26th IEEE International Conference on Distributed Computing Systems (ICDCS), 2006.
[5] Ying Zhang, Shashidhar Gandham, and Qingfeng Huang. Distributed minimal time convergecast scheduling for small or sparse data sources. In 28th IEEE International Real-Time Systems Symposium, 2007.
[6] Scott C.-H. Huang, Peng-Jun Wan, and Chinh T. Vu. Nearly constant approximation for data aggregation scheduling in wireless sensor networks. In IEEE INFOCOM, 2007.
[7] Bo Yu, Jianzhong Li, and Yingshu Li. Distributed data aggregation scheduling in wireless sensor networks. In IEEE INFOCOM, 2009.
[8] Yanwei Wu, Xiang-Yang Li, and YunHao Liu. Energy-efficient wake-up scheduling for data collection and aggregation. IEEE Transactions on Parallel and Distributed Systems, 2010.
[9] Siyuan Chen, Shaojie Tang, and Minsu Huang. Capacity of data collection in arbitrary wireless sensor networks. In IEEE INFOCOM, 2010.
[10] Jungeun Choi, Joosun Hahn, and Rhan Ha. A fault-tolerant adaptive node scheduling scheme for wireless sensor networks. Information Science and Engineering, 25:273–287, 2009.