Cluster Comput (2011) 14: 27–40 DOI 10.1007/s10586-009-0088-9
Sensor scheduling for p-percent coverage in wireless sensor networks Yingshu Li · Chunyu Ai · Zhipeng Cai · Raheem Beyah
Received: 31 December 2008 / Accepted: 23 April 2009 / Published online: 28 May 2009 © Springer Science+Business Media, LLC 2009
Abstract We study sensor scheduling problems for p-percent coverage in this paper and propose two scheduling algorithms to prolong network lifetime, exploiting the fact that for some applications full coverage is not necessary and that different subareas of the monitored area may have different coverage requirements. The Centralized p-Percent Coverage Algorithm (CPCA) we propose selects the least number of nodes needed to monitor p-percent of the monitored area. The Distributed p-Percent Coverage Protocol (DPCP) we present determines, in a distributed manner, a set of nodes that covers p-percent of the monitored area. Both algorithms guarantee network connectivity. The simulation results show that our algorithms can remarkably prolong network lifetime, produce less than 5% unrequired (excess) coverage for large networks, and employ nodes fairly in most cases.

Keywords Wireless sensor networks · p-coverage · Sensor scheduling

Y. Li · C. Ai: Computer Science Department, Georgia State University, Suite 1413, 34 Peachtree Street, Atlanta, GA 30303, USA
Z. Cai: Department of Computing Science, University of Alberta, 221 Athabasca Hall, Edmonton, Alberta T6G 2E8, Canada
R. Beyah: Computer Science Department, Georgia State University, Suite 1451, 34 Peachtree Street, Atlanta, GA 30303, USA
1 Introduction

Wireless Sensor Networks (WSNs) have attracted much attention in recent years. Much effort has been devoted to this area, and many theoretical results have been published. Moreover, many practical projects are under development, such as tracking moving objects, real-time traffic monitoring, air quality surveillance, and pollution monitoring. One common characteristic of these projects, however, is that the sensing devices they use are still powered by either power cables or large solar cells. In the past 35 years, the capacity of batteries has only doubled, which is trivial compared with the progress in semiconductor technology. Therefore, in the coming years, wireless sensors' capabilities of computation, storage, and communication may not be the limitations; in contrast, how to conserve energy and consume it more efficiently is still one of the most challenging problems.

WSNs are designed to conduct surveillance tasks, such as monitoring an area or several known or unknown targets. Each sensor has a sensing area in which the events of interest can be detected. Hence, the sensing area of a WSN is a decisive factor in the quality of surveillance. Intuitively, to obtain the best surveillance performance, the interested area or all targets should be continuously and entirely monitored by the WSN, so that everything that happens can be detected immediately. For some applications with strict monitoring requirements, continuous full coverage is indeed required, and it may even be required that each point in the surveillance area be monitored by multiple sensors at any time. For example, to survey a battlefield, merely covering the entire battlefield is far from enough. To guarantee higher surveillance sensitivity and reduce the possibility of missing targets, every point in the battlefield should be covered by multiple sensors in order to avoid
blind points and to lower the damage caused by hostile destruction. On the other hand, there are scenarios with different surveillance requirements and deployment environments. In those cases, a few observation errors, missing data, communication delays, and even partial failure of the network are tolerable or can be compensated for in some way. In essence, network lifetime is the first concern, and the main issue is how to prolong network lifetime as much as possible while satisfying the basic surveillance requirements. For instance, in humid or rainy seasons, the possibility of fire in a forest is much lower than in dry seasons, so fully monitoring every inch of the target area is not necessary. We may consider lowering the coverage levels for the subregions where fire is unlikely to occur while placing emphasis on only a few important subregions, so that more sensors can be turned off and network lifetime can be prolonged. Consequently, different monitored areas require different coverage levels.

Instead of focusing on full coverage, in this paper we investigate partial coverage problems. In the p-Percent Coverage (PPC) problem, which is a special case of the EPPC problem introduced below, only p-percent of the target area needs to be covered, so that network lifetime can be prolonged while the surveillance quality is still guaranteed. Furthermore, for the cases in which some subregions need special attention, we define the Extended p-Percent Coverage (EPPC) problem: a particular required coverage percentage can be assigned to each subregion of the surveillance area, so that the surveillance quality can be finely controlled and unnecessary energy consumption can be further reduced.

We propose two algorithms to solve the EPPC problem, CPCA and DPCP. The centralized algorithm, CPCA, constructs a subset of sensor nodes to provide the necessary coverage in a heuristic way; the coverage contribution and residual energy of each node are the decisive factors in determining whether a node works in the next turn. Since centralized algorithms may not be able to react immediately to changes that happen while the network is working, we also develop DPCP, a distributed algorithm. DPCP also works in turns: in each turn, a subset of nodes is selected according to their worth (Sect. 4.2.2) value to construct the next working set. Traditionally, network lifetime is defined as the period from the beginning of the network's operation to the time when the area can no longer be fully covered. For the EPPC problem, however, due to the special coverage requirement, a new definition of network lifetime is introduced in Definition 3.1.

Our contributions include defining the EPPC problem and the SSPC problem and proposing two algorithms to solve the SSPC problem. Compared with the existing p-percent coverage algorithms, our algorithms have the following improvements:
1. Our algorithms guarantee that at least p-percent of the target area is covered, and we also give an upper bound on the coverage percentage for our centralized algorithm.
2. Our algorithms have lower computation complexity (CPCA) and communication overhead (DPCP).
3. The connectivity of the network is guaranteed.
4. We provide more flexibility to users by allowing them to specify a different coverage level for each subregion.

The rest of this paper is organized as follows. Section 2 surveys the related work. In Sect. 3, we introduce some notation and define the PPC problem, the EPPC problem, and the problem of Sensor Scheduling for p-Percent Coverage (SSPC). Section 4 presents our algorithms, CPCA and DPCP. The simulation results are illustrated in Sect. 5, and Sect. 6 concludes this paper.
2 Related work

The coverage problem in WSNs has been widely studied in the past years. Existing work on the coverage problem can be classified into three categories: the target coverage problem, the area coverage problem, and the breach coverage problem [1].

The target coverage problem, as its name implies, aims to monitor (cover) a set of targets with known locations while maximizing network lifetime. Cardei et al. [1] designed a Linear Programming algorithm to construct as many subsets of sensors as possible, each covering all known targets, so that network lifetime can be extended: if targets are covered by multiple sensors, assigning the sensors to mutually exclusive subsets and activating only one subset at a time conserves energy. In [2], the relationship between the desired QoS requirement and the minimum number of active nodes is analyzed for randomly deployed sensor networks, and the authors propose a location independent and coverage efficient (LICE) protocol for WSNs based on their analytical results. LICE selects the minimum number of active nodes based on their residual energy and previous working status and, more importantly, does not need any location information. LICE not only reduces energy consumption and prolongs network lifetime, but also balances the energy consumption among sensors.

In [3, 4], the authors proposed algorithms for the area coverage problem, whose main goal is to monitor (cover) an area instead of known targets. Depending on the coverage requirements, the targets or the area may need to be k-covered (k ≥ 1). In [5], the Sensor Scheduling for k-Coverage (SSC) problem was addressed and several greedy algorithms (the PCL series) were proposed
to provide k-coverage and maximize network lifetime. In [6], the authors model the coverage problem as a decision problem whose goal is to determine whether each location is sufficiently covered by at least k sensors. Instead of determining the coverage level of each location, their solutions are based on checking the perimeter of each sensor's sensing range, and the distributed algorithm they propose solves this problem in polynomial time. In [7], the asymptotic coverage of a square or disk region by a Poisson or uniform point process is investigated; the authors study how the probability of k-coverage changes with the sensing radius or the number of sensors. The breach coverage problem addressed in [8] is about minimizing the number of uncovered targets. A new notion of information coverage based on accurate estimation was proposed by Wang et al. in [9]: a point is said to be completely information-covered if enough sensors exist to keep the estimation error below a predefined threshold. A survey of energy-efficient coverage problems is presented in [10], where the authors introduce various coverage formulations and assumptions and give an overview of solutions to coverage problems.

Many efforts have been devoted to area coverage problems, most of them focusing on full coverage. To the best of our knowledge, there are only a few papers on the p-percent coverage problem or similar topics. In [11], the authors proposed a differential random deployment design: by strategically deploying more sensor nodes in the areas with higher priority, differential random deployment can effectively lower the possibility of missing events of interest compared to uniform random deployment. In [12], the authors used faces to model the monitored area and proposed a (1 + ln(1/(1 − q)))-approximation algorithm for the case where q-percent of the monitored area is required to be covered; however, with a larger q, the performance of their algorithm degrades considerably. Tan proposed a hexagon-based algorithm to maximize network lifetime in [13]: the monitored area is separated into many small hexagonal regions, the deployed sensors are grouped according to the hexagonal region in which they reside, and three rules are used to select an active sensor in each hexagonal region in each turn. Network connectivity is guaranteed and, as a byproduct, the portion of the covered area is determined by the size of the hexagons. However, the value of this portion is not controllable and fluctuates within a range; meanwhile, the network is activated approximately uniformly, which means there is no way to enhance or degrade the coverage level of a certain region. In [14], the p-percent coverage problem, which only requires p% of the whole area to be monitored, is investigated. A greedy algorithm, pPCA, is proposed to solve the p-percent coverage problem, and a connected p-percent coverage distributed algorithm is proposed as well.
Moreover, a distributed algorithm based on a Connected Dominating Set, CpPCA-CDS, is presented to address the connected p-percent coverage problem, which requires the working nodes to be connected; by using CpPCA-CDS, the Sensing Void Distance can be bounded by a constant. These existing works on p-percent coverage, however, cannot specify a different p for different subregions. Compared with the existing work, we define a more specific notion of network lifetime for the EPPC problem (Definition 3.1), and our two proposed algorithms for the SSPC problem provide controllable coverage levels for different subregions, have lower computation complexity and communication overhead, and guarantee network connectivity.
3 Preliminaries and problem definition

We consider a 2-dimensional area with N sensors deployed, where no two sensors are deployed at the same location. For some applications, considering the power constraints and the coverage requirements, not every point in the area needs to be monitored continuously; that is, it may not be necessary to keep 100% of the target area under surveillance. Assume only p% of A needs to be covered. In addition, there are usually some special regions in the target area, for example a small region containing rare plants or a pond in a forest. Users may want to set a different threshold pj as the coverage percentage for these special regions; pj may be greater or less than p according to the importance of the region. Figure 1 shows a typical example: a square area to be monitored is divided into 4 × 4 subregions so that users can easily assign a different pj to each subregion.
Fig. 1 Network subregions
Table 1 Notations N
The number of deployed sensors
S
The set of deployed sensors
si
The ith sensor
K
The number of subsets
Sk
The kth sensor subset
A
The area need to be monitored which is separated
Definition 3.2 (Extended p-percent coverage problem) Given N sensors, a 2-dimensional area A which is divided into J subregions and a specific coverage percentage pj for each subregion Aj , find a set of n sensors to cover A such that n pj Aj ≤ Rj,i , ∀j = 1, . . . , J i=1
into J subregions J
The number of subregions
Aj
The jth subregion in A
Aj
The extended jth subregion
L
The network lifetime
Li
The lifetime of si
Dj
The network diameter in the jth subregion
p
The predefined coverage percentage for A
pj
The predefined coverage percentage for Aj
tk
The lifetime of Sk
ti,k
The working time of si in Sk
Ri
The region covered by si
Rj,i
The region in Aj covered by si
Rj,i,k
The region in Aj covered by si in Sk ; if si is not
and meanwhile, n is minimized. The EPPC problem seeks to find a subset to provide required coverage. However, considering the entire network lifetime, we have to take it into account the scheduling of sensors. Therefore, based on the EPPC problem we define the Sensor Scheduling for p-Percent Coverage problem as following. Definition 3.3 (Sensor scheduling for p-percent coverage (SSPC)) Given a 2-dimensional area A and a set of sensors S, Objective: Maximize L Subject to:
in Sk , Rj,i,k ’s area is zero.
L=
K
(1)
tk
k=1
subregion B3 is a special area, so 90% of this region need to be monitored continuously. However, for the subregion D1, obviously, it is not very important, hence monitoring 50% of this region is enough. In this way, users can deploy more sensors in the critical regions and deploy less sensors in unimportant regions such that the equipment cost can be reduced and network lifetime can be prolonged under the same hardware cost. For convenience, we define necessary notations in Table 1. On the other hand, separating A into subregions and assigning different coverage percentage to each subregion is a good strategy for large-scale networks. Therefore, we introduce the Extended p-Percent Coverage (EPPC) problem. Because for the EPPC problem each subregion has different importance, users may assign weight wj for each subregion. Therefore, the area of the subregions, whose coverage requirements can be guaranteed, and their weights are the two factors to determine Network End-time. We define Network End-time as following. Definition 3.1 (Network end-time of EPPC problem) Given an area A which is divided into J subregions and the weight wj assigned to the subregion j , network lifetime ends at the time that the percentage of weighted area of subregions whose coverage can be guaranteed by living nodes is less than the user-input f . (Represented by Formula (4) in Definition 3.2.)
tk = min(Li ),
∀worki,k > 0, ∀i ∈ S, ∀k ∈ K
K (worki,k · tk ) ≤ Li ,
∀i ∈ S
(2) (3)
k=1
J
j =1 Aj · wj · Fj,k J j =1 Aj · wj
N
R
≥ f,
∀j ∈ J, ∀k ∈ K
(4)
i=1 j,i,k 1, if ≥ pj Aj Fj,k = 0, otherwise 1, if si is in Sk worki,k = 0, otherwise
(5)
(6)
Formula (1) defines network lifetime as the sum of the lifetimes of all working subsets. Formula (2) sets the working time of Sk to the minimum remaining lifetime among its members. Formula (3) guarantees that the total working time of si does not exceed its lifetime. Formula (4) decides the network end-time. Formula (5) determines whether pj is guaranteed in Aj.

4 p-percent coverage scheduling algorithm

We assume there are enough sensors deployed in the monitored area A, which means that with all N sensors the area can be at least p-percent covered and all subregions' coverage requirements can be satisfied. All sensors have the same fixed sensing range Rs but possibly different energy levels.
Fig. 2 Extended subregion
4.1 Centralized algorithm

First, we introduce our Centralized p-Percent Coverage Algorithm (CPCA). The basic idea is to construct a node subset C iteratively such that each subregion Aj is pj-covered by the nodes in this subset. During each iteration, a node which covers the most uncovered area in the subregion currently being considered is added into C. Eventually, a connected node subset is constructed which pj-covers each subregion Aj.

The pseudo-code is shown in Algorithm 1. CS is the set of all subsets, which is the final result we want, and C is the subset being constructed in the current iteration. From Line 9 to Line 17, in each iteration, CPCA checks whether Aj is pj-percent covered. If not, a new node which is connected with C, is the farthest such neighbor from C, and increases Rj,C (the region in Aj covered by C) is chosen to be added into C. When Aj is done, CPCA tries to find the next subregion to be considered. At Line 26, CPCA does so by finding an Extended Subregion A′j which is covered by C while Aj is not yet pj-percent covered by C. An Extended Subregion A′j is an extension of Aj, as shown in Fig. 2: A′j contains all the nodes covering Aj and is also the maximum possible area that can be covered by nodes located in Aj. After several iterations, if pj cannot be satisfied even when all the nodes have been included in C, pj-percent coverage is impossible for this subregion, and its status is marked as FAILED, as shown at Line 21. Depending on the value of f mentioned in Definition 3.1, CPCA may then determine the network end-time: CPCA stops when f is reached and returns CS. If there is no UNBUILT subregion, all the subregions are either covered by C or of FAILED status. In that case, we say the current C has been constructed successfully; it is added into CS, the nodes' energy information is updated, and a new construction process begins from the very beginning.
Algorithm 1 CPCA(pj (1 ≤ j ≤ J), S)
1: Sort the nodes in S in non-increasing order according to their energy levels.
2: Set all subregions' statuses as UNBUILT
3: CS = null
4: node = S.get(0)
5: Aj = subregion where node resides
6: C.add(node)
7: while |S| > 0 do
8:   while Aj is UNBUILT do
9:     while Rj,C < pj Aj & |S| > 0 do
10:      node = C's farthest connected neighbor in Aj
11:      if node = null then
12:        break
13:      else if node can increase Rj,C then
14:        C.add(node)
15:      end if
16:      S.remove(node)
17:    end while
18:    if Rj,C ≥ pj Aj then
19:      GridStatus[j] = BUILT
20:    else
21:      GridStatus[j] = FAILED
22:      if the area that can never be covered is greater than f A then
23:        return CS
24:      end if
25:    end if
26:    Find a new A′j which is covered by C but Aj is not pj-percent covered by C.
27:  end while
28:  if there is an UNBUILT subregion then
29:    Connect(C)
30:    Find an A′j which is covered by C but Aj is not pj-percent covered by C.
31:  else (C has been constructed successfully)
32:    Update lifetime of the nodes in C. If a node still has energy, add it back into S.
33:    CS.add(C), C = null
34:    Sort nodes and reset GridStatus as UNBUILT
35:    node = S.get(0), Aj = subregion where node resides
36:    C.add(node)
37:  end if
38: end while
39: return CS
After Line 27 in Algorithm 1, the node set C may not be able to cover every subregion of the surveillance area: there may exist a subregion Aj such that none of the nodes in C covers Aj, so C will never span to Aj. To solve this problem, we need to actively add more nodes into C
such that it can reach Aj later. Algorithm 2 is designed for this purpose.

Algorithm 2, Connect, handles one isolated UNBUILT subregion Aj at a time. Its purpose is to look for the nodes needed to build a communication path and add them into C, so that C can later span into Aj by executing the first part of Algorithm 1 (from Line 9 to Line 17). At Line 1 in Algorithm 2, the isolated UNBUILT subregion that is closest to C is chosen. Then, starting from C, layers of neighboring nodes CN_i are constructed until at least one node can cover the subregion Aj or no more nodes can be used; the latter means the subregion is truly isolated from the others. Once the node covering Aj is determined, a communication path can be constructed by tracing from CN_i back to CN_0 (the original C).

Algorithm 2 Connect(C)
1: j = C's closest UNBUILT subregion
2: i = 0, CN_0 = C
3: while Rj,CN_i = 0 & |CN_i| > 0 do
4:   i++
5:   CN_i = neighborOf(CN_{i−1}) \ (CN_0 ∪ · · · ∪ CN_{i−1})
6: end while
7: if |CN_i| > 0 then
8:   Choose the node in A′j with the most energy among the nodes in CN_i.
9:   Add the node into C.
10:  while i ≥ 0 do
11:    Choose the farthest connected node to C in CN_i and add it into C.
12:    i−−
13:  end while
14: else (Aj cannot be reached)
15:  GridStatus[j] = FAILED
16: end if
17: return
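For illustration, the layer-by-layer expansion and back-tracing of Algorithm 2 can be sketched in code as follows. This is only our reading of the pseudocode, not the authors' implementation: nodes are plain integer indices, and the adjacency list, coverage flags and energy array are assumed inputs.

import java.util.*;

// Sketch of Algorithm 2 (Connect): starting from the current working set C,
// grow neighbor layers CN_0, CN_1, ... until some node in the newest layer
// covers part of the target subregion A_j, then trace one path back towards C.
// adj is the (symmetric) communication graph, coversAj[i] says whether node i's
// sensing disk intersects A_j, and energy[i] is node i's residual energy.
public class ConnectSketch {

    // Returns the nodes to add into C (covering node plus relay path),
    // or null if A_j cannot be reached (the subregion is isolated: FAILED).
    public static List<Integer> connect(Set<Integer> C, List<List<Integer>> adj,
                                        boolean[] coversAj, double[] energy) {
        List<Set<Integer>> layers = new ArrayList<>();
        Set<Integer> visited = new HashSet<>(C);
        layers.add(new HashSet<>(C));                        // CN_0 = C
        int i = 0;
        while (layers.get(i).stream().noneMatch(n -> coversAj[n])) {
            Set<Integer> next = new HashSet<>();
            for (int n : layers.get(i))
                for (int nb : adj.get(n))
                    if (visited.add(nb)) next.add(nb);       // CN_{i+1} = N(CN_i) \ (CN_0 ∪ ... ∪ CN_i)
            if (next.isEmpty()) return null;                 // no more usable nodes
            layers.add(next);
            i++;
        }
        // Among the outermost layer, pick a node covering A_j with the most energy ...
        int cur = layers.get(i).stream().filter(n -> coversAj[n])
                .max(Comparator.comparingDouble((Integer n) -> energy[n])).get();
        List<Integer> toAdd = new ArrayList<>();
        toAdd.add(cur);
        // ... then walk back through CN_{i-1}, ..., CN_1, picking any node connected
        // to the one chosen in the layer above, so that a communication path to C exists.
        for (int k = i - 1; k >= 1; k--) {
            for (int cand : layers.get(k)) {
                if (adj.get(cand).contains(cur)) { cur = cand; toAdd.add(cur); break; }
            }
        }
        return toAdd;
    }
}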
Theorem 4.1 The time complexity of the CPCA algorithm is O(N(N + γ)).

Proof At Line 10 in Algorithm 1, the farthest connected node can be found in linear time, i.e., O(N). We divide the whole monitored area into γ grid cells, where each cell is small enough that no cell is partially covered by any sensor, and we assume that γ is bounded. At Line 13, the area covered only by node can then be calculated in time O(γ). The while loops at Lines 8 and 9 result in at most N executions of Lines 10 and 13 in total, so the execution time for this part is O(N(N + γ)). In Algorithm 2, the maximum number of iterations of the while loop from Line 3 to Line 6 is bounded by the network diameter, which is O(γ).
Fig. 3 3 phases of DPCP (not proportional)
Similarly, the maximum number of iterations of the second while loop is also O(γ ). Hence, the complexity of Algorithm 2 is O(γ ). Therefore, the time complexity of CPCA is O(N (N + γ )).
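To make the grid-based area bookkeeping in this analysis concrete, the following sketch estimates the covered fraction of a (sub)region by testing the centers of γ grid cells against the sensing disks of the selected nodes. It is only our illustration of the counting argument: the center test is an approximation of the idealized assumption above, and the cell size, coordinates and method names are our own.

// Illustration of the grid-based area bookkeeping used in the analysis above:
// the subregion is cut into small square cells, and a cell counts as covered
// when its center lies inside the sensing disk of some selected node. With
// gamma cells, checking coverage against one node costs O(gamma).
public class GridCoverage {

    // Fraction of cells of the rectangle [x0, x0+w] x [y0, y0+h] covered by
    // sensing disks of radius rs centered at (xs[i], ys[i]).
    static double coveredFraction(double x0, double y0, double w, double h,
                                  double cell, double[] xs, double[] ys, double rs) {
        int nx = (int) Math.ceil(w / cell), ny = (int) Math.ceil(h / cell);
        int covered = 0;
        for (int i = 0; i < nx; i++) {
            for (int j = 0; j < ny; j++) {
                double cx = x0 + (i + 0.5) * cell;       // cell center
                double cy = y0 + (j + 0.5) * cell;
                for (int s = 0; s < xs.length; s++) {
                    double dx = cx - xs[s], dy = cy - ys[s];
                    if (dx * dx + dy * dy <= rs * rs) {  // center inside a sensing disk
                        covered++;
                        break;
                    }
                }
            }
        }
        return (double) covered / (nx * ny);
    }

    public static void main(String[] args) {
        // Example: two nodes with sensing range 50 m in a 200 m x 200 m subregion, 5 m cells.
        double[] xs = {60, 140}, ys = {60, 140};
        double frac = coveredFraction(0, 0, 200, 200, 5, xs, ys, 50);
        System.out.printf("covered fraction = %.2f (compare against p_j)%n", frac);
    }
}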
4.2 Distributed protocol

In this section, we design a localized algorithm for p-percent coverage, called the Distributed p-Percent Coverage Protocol (DPCP), which provides the necessary surveillance quality while maximizing network lifetime. We first state a few assumptions. We assume that each node has a unique ID, which is common on most platforms, such as TelosB [15].

DPCP consists of 3 phases: Discover, Construct and Connect, as shown in Fig. 3. Each phase has a fixed length. The dashed line in each phase represents the actual finish time of each subregion. Phases 2 and 3 have to begin at their predefined times, even if the previous phase finishes before that time. The tasks of the three phases are as follows:
1. Discover: discover neighbors.
2. Construct: construct a subset of nodes to p-percent cover each subregion.
3. Connect: connect all subsets together.

In each subregion Aj, at the beginning of each turn, the following algorithm is executed to construct a subset of nodes that pj-covers this subregion. At the very beginning, a node is randomly chosen and set as the first ROOT node, at which the algorithm begins to run. The DPCP algorithm is described as follows:
1. Each node wakes up and sends a HELLO message to all 1-hop neighbors. Every node stays active for Tdecision time (see Fig. 3) and goes to sleep if it is not involved in the next turn.
2. The current ROOT node calculates the current coverage percentage cp of C as described in Sect. 4.2.1. If cp ≥ pj, a SUCCESS message including the information of C is sent to each node in C and the algorithm goes to Step 5. If not, go to Step 3.
3. The current ROOT node calculates worthi for each node i in Neighbor(ROOT) \ C. The one with the highest worth among Sibling(ROOT) and Neighbor(ROOT) \ C is chosen as the next ROOT node, ROOTnext, and
added into C. Sibling(ROOT) is given in Step 4, and Neighbor(ROOT) contains all of ROOT's 1-hop neighbors. If there is no node in Sibling(ROOT) or Neighbor(ROOT), ROOT traces back to ROOTparent, the previous ROOT node.
4. A BE_ROOT message is sent from the current ROOT node to ROOTnext together with the information of Sibling(ROOT), which is the node with the second highest worth among the neighboring nodes of the previous ROOT node, and the current coverage percentage cp. If ROOTnext is Sibling(ROOT), the sibling information in the BE_ROOT message is invalidated in order to prevent thrashing (shown in Fig. 4), which would incur a heavy communication cost. When ROOTnext receives a BE_ROOT message, it executes this algorithm from Step 2 as the new ROOT node. If the receiver is not ROOTnext, it marks the sender as a selected node.
5. Each node in C broadcasts a CONNECT message. This message is handled as follows.

Note: in the following description, Ak is used to represent any neighboring subregion of Aj. Each message has a counter stepCount used to record the hop distance from the source; initially, stepCount is set to 1. minStep(j) is used to record the current minimal number of hops to the Cj of Aj and is initialized to a maximal value.
(a) If a message sent from Aj is received by a node in the C of Aj, it is dropped.
(b) If a message sent from Aj is received by a node in Aj but not in C, the receiver does the following:
– if stepCount < minStep(j), add itself into the message as a retransmitter, update minStep(j) with stepCount, increase the stepCount of this message by one, and broadcast it
– if stepCount ≥ minStep(j), drop it.
(c) If a message sent from Aj is received by a node which is not in Aj and will not be active in the next turn, the receiver does the following:
– if Aj is not a neighboring subregion of the subregion in which the receiver is, or the message did not come from Aj directly, drop it
– if stepCount < minStep(j), add itself into the message as a retransmitter, update minStep(j) with stepCount, increase the stepCount of this message by one, and broadcast it
– if stepCount ≥ minStep(j), drop it.
(d) If a message sent from Aj is received by a node which is not in Aj and will be active in the next turn, the receiver does the following:
– if Aj is not a neighboring subregion of the subregion in which the receiver is, or the message did not come from Aj directly, drop it
– if stepCount < minStep(j), add itself into the message as a retransmitter, update minStep(j) with stepCount, update the time-stamp of this message with the receiving time, and keep this message as CMsg(j)
– if stepCount ≥ minStep(j), drop it.

When a node in the C of one of Aj's neighboring subregions has received m CONNECT messages from Aj, where m is a predefined threshold, it sends CMsg(j) to all the other nodes in this C. Upon receiving this message, a receiver immediately stops accepting CONNECT messages coming from Aj, even if it has not yet received m CONNECT messages from Aj, and sends its own CMsg(j) to the others. Eventually, each node in C holds multiple CMsg(j) messages from the other members of C. Only the one with the minimum stepCount and the earliest time-stamp is chosen, and the route information in that message is used, which means that all retransmitters of this message are added into the Cs of both Aj and Ak. A CONNECT_PATH message duplicated from the original CONNECT message is then sent back to Aj from the original receiver. All nodes contained in the message's routing information are notified that they are involved in each region's C. The final receiver in Aj is the original sender of the corresponding CONNECT message, which is responsible for notifying the other nodes in C of the new nodes used for connecting to Ak.

Fig. 4 Thrashing

4.2.1 Localized calculation of the coverage percentage

Radio communication is the dominant source of energy consumption for wireless sensor nodes. For energy efficiency, we therefore need to reduce the number and the size of the messages used to transmit global information. Transmitting the complete information of all nodes in C would be too expensive for WSNs if C contains many nodes. However, to calculate the area of the covered region, C must be known by each new ROOT node in advance. This means that the information of each node in C would have to be transmitted to each new ROOT node during the execution of DPCP, so that the new ROOT node can calculate cp and decide whether it needs to find the next ROOT node to cover more area.

We assume that the sensing range is smaller than the communication range, which is not a strict limitation because the communication range is usually about one hundred meters or more. For instance, the communication range of mica2 [16] is 500 feet (about 150 meters), whereas the sensing range is much smaller in most cases, such as for light, temperature and humidity sensors. As shown in Fig. 5, solid circles describe the communication ranges of sensors and dashed circles represent their sensing ranges. S2 and S3 are neighbors of S1, and S1 is the current ROOT node. Since S2 and S3 are already in C, the area newly covered by S1 is only the shaded strip, not the whole sensing area of S1. Our idea is to have each node (not only those that will be or are ROOT nodes) know what has happened to its neighbors. When a node receives a BE_ROOT message, which implies that the sender has been selected as a member of C, it marks this neighbor as a selected node. When the node itself is later selected, it knows which neighbors are in C and can calculate how much area is covered only by itself. Since the communication range is greater than the sensing range, each node only needs to maintain the information of its one-hop neighbors. Therefore, cp can be computed locally and is the only data that needs to be transmitted to the next ROOT node.

Fig. 5 Localized calculation of coverage

4.2.2 Calculating worth

DPCP uses worth to evaluate the contribution of each node in order to choose the best candidate for constructing C. To evaluate a candidate node si, several factors need to be taken into account. Intuitively, the area covered only by si is one of the most important criteria: the node which covers the most uncovered area should be chosen. However, there are other factors to consider. To calculate the covered area, the ROOT node would need to know the neighborhood information of si, which cannot be obtained with the strategy described in Sect. 4.2.1. One solution is to ask si to do this calculation and return the result to the ROOT node. This would result in a higher communication cost because the ROOT node needs to collect results from all of its neighbors, and since this happens every time a new ROOT node is selected, the overhead is too high. Therefore, we instead use another parameter to approximately describe the coverage contribution of si: the distance between the ROOT node and si. Intuitively, the larger the distance, the more si can cover. Nevertheless, the distance cannot reflect the fact that a part of the area it represents might already be covered by ancestors of the current ROOT node, as shown in Fig. 6. In Fig. 6, the dashed circles, s1 and s2, are candidates. s1 is farther from the ROOT node than s2 and covers more area than s2 does. However, the actual contribution of s1 is less than that of s2, because most of the area covered by s1 is also covered by other nodes in C. Therefore, the angle α, shown in Fig. 6 and ranging from 0 to π, is included in the evaluation. A larger α means the node is farther away from C and, in most cases, can contribute more to coverage. Additionally, the residual energy of each node should also be considered. In total, we use three parameters to evaluate candidate nodes; worth is calculated using Formula (7).

Fig. 6 worth parameters

worth(d, α, e) = λ · min(d / (Rsi + RROOT), 1) + μ · (α / π) + e / E    (7)

Here d is the distance between ROOT and the candidate node si, E is the initial energy of a sensor, and e is the residual energy of si. λ and μ can be used to tune the relative weights of the three parameters in Formula (7). Intuitively, a higher λ
means that, with a lower possibility of overlapping, distance is the first concern. Similarly, users can use a higher μ to indicate that straight communication paths are preferred. In sparse networks, overlapping among ancestors and children seldom happens, so the influence of α can be ignored. In dense networks, however, a larger α results in less overlapping, which means a larger contribution to cp.

d′ = d / toleranced    (8)
α′ = α / toleranceα    (9)

With Formulas (8) and (9), when the differences among the candidates' d and α values are trivial, the residual energy level becomes the crucial factor. The worth of each node is then calculated by replacing d and α with the results of Formulas (8) and (9), respectively. The values of toleranced and toleranceα depend on the specific network environment.

4.2.3 Algorithm performance

Theorem 4.2 The message complexity of DPCP for a subregion Aj is nj + 6|Cj| + 4, where nj is the number of nodes in Aj and Cj is the constructed working set in Aj.

Proof In the first phase of DPCP, every node sends a HELLO message, so the average number of these messages per subregion is nj. In the second phase, the nodes construct a subset of nodes to cover Aj; |Cj| − 1 BE_ROOT messages are needed, plus one SUCCESS message, so the number of messages sent in this phase is |Cj|. In the last phase, each node in Cj broadcasts a CONNECT message, which requires |Cj| messages. Since each subregion has 8 neighbors and each connection is shared between two subregions, on average a subregion processes received CONNECT messages with 4 subregions. Constructing the connection with one neighbor requires |Cj| CMsg messages and one CONNECT_PATH message. Hence, 5|Cj| + 4 messages are needed in this phase. Therefore, in total nj + 6|Cj| + 4 messages are needed for each subregion Aj.
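As an illustration of the worth evaluation of Sect. 4.2.2, the sketch below computes Formula (7), reading Rsi and RROOT as the sensing ranges of the candidate and the ROOT node and reading Formulas (8) and (9) as quantization by the tolerances (so that near-equal d and α collapse and residual energy decides). These readings and all numeric values are our assumptions, not the authors' code.

// Sketch of the worth evaluation of Formula (7), with Formulas (8)-(9) read as
// quantizing d and alpha by the tolerances so that near-equal candidates are
// separated by residual energy. Parameter values in main() are arbitrary examples.
public class Worth {

    static double worth(double d, double alpha, double e,
                        double rsCand, double rsRoot, double initialEnergy,
                        double lambda, double mu,
                        double toleranceD, double toleranceAlpha) {
        // Formulas (8) and (9): coarsen d and alpha (one possible interpretation).
        double dq = Math.floor(d / toleranceD) * toleranceD;
        double aq = Math.floor(alpha / toleranceAlpha) * toleranceAlpha;
        // Formula (7): distance term (capped at 1), angle term, residual-energy term.
        double distanceTerm = Math.min(dq / (rsCand + rsRoot), 1.0);
        double angleTerm = aq / Math.PI;
        double energyTerm = e / initialEnergy;
        return lambda * distanceTerm + mu * angleTerm + energyTerm;
    }

    public static void main(String[] args) {
        // Two candidates seen from the current ROOT node (numbers invented):
        // s1 is farther but mostly overlaps C (small alpha), s2 is closer but points away from C.
        double w1 = worth(45, 0.6, 0.7, 50, 50, 1.0, 1.0, 1.0, 5, 0.2);
        double w2 = worth(35, 2.8, 0.9, 50, 50, 1.0, 1.0, 1.0, 5, 0.2);
        System.out.printf("worth(s1)=%.3f, worth(s2)=%.3f -> pick the larger%n", w1, w2);
    }
}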
5 Simulations

We developed a Java program to evaluate our algorithms. Three major results can be summarized as follows:
1. Compared with full coverage algorithms, CPCA and DPCP prolong network lifetime (Definition 3.1) remarkably.
2. CPCA and DPCP have very good accuracy on the coverage percentage when the subregion size is at least 5 times the sensor's sensing range.
3. CPCA and DPCP make use of the nodes in different subregions fairly and show good balance in the number of remaining nodes when the network dies.

The simulation environment is defined as follows. The surveillance area is a square separated into 3 × 3 subregions, whose size ranges from 2 to 10 times the sensing range. Purely random deployment is not reasonable for the EPPC problem, because a subregion with the lowest required coverage percentage could then receive the same number of nodes as a subregion with the highest coverage percentage. Therefore, in our simulations, nodes are deployed into each subregion in proportion to pj. Each node has the same sensing range, 50 m, and the same communication range, 200 m. All simulations are repeated 50 times.

Because each subregion has a different coverage requirement, randomly deploying nodes over the whole surveillance area without considering this difference is not reasonable and would obviously waste many nodes in the subregions with lower coverage requirements. Therefore, Proportional Deployment is introduced to control the node deployment: the nodes assigned to a subregion depend on its area and its required coverage percentage, so that more area to be covered in a subregion induces more nodes to be deployed there.

5.1 Network lifetime

Network lifetime is our top concern; its definition for EPPC problems was given in Definition 3.1. In our simulations, f = 50%. Two major issues are investigated in this subsection.

5.1.1 Compared with full coverage algorithms, how much improvement can be achieved by applying our p-percent coverage algorithms?

To determine the improvement, we compare CPCA and DPCP with GS [5], which is a k-coverage algorithm that generates close-to-optimal solutions. Using the same node deployment for CPCA, DPCP and GS would not be very meaningful, because the advantages of the three algorithms could not be shown. Thus, to show the best performance of GS, we set k = 1 and apply a density control technique which deploys nodes more reasonably in order to reduce the number of residual nodes when 1-coverage can no longer be satisfied. The simulation configuration is shown in Table 2. The simulation results are shown in Fig. 7.
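The proportional deployment described above can be sketched as follows. The rule that subregion j receives a share of nodes proportional to pj · Aj, and the largest-remainder rounding, are our assumptions; the paper only states that more required coverage in a subregion induces more deployed nodes.

import java.util.*;

// Sketch of proportional deployment: allocate totalNodes across subregions in
// proportion to p_j * A_j, rounding with a largest-remainder step so that the
// counts sum to totalNodes exactly.
public class ProportionalDeployment {

    static int[] allocate(int totalNodes, double[] p, double[] area) {
        int m = p.length;
        double[] weight = new double[m];
        double sum = 0;
        for (int i = 0; i < m; i++) { weight[i] = p[i] * area[i]; sum += weight[i]; }

        int[] count = new int[m];
        double[] remainder = new double[m];
        int assigned = 0;
        for (int i = 0; i < m; i++) {
            double exact = totalNodes * weight[i] / sum;
            count[i] = (int) Math.floor(exact);
            remainder[i] = exact - count[i];
            assigned += count[i];
        }
        // Hand out the leftover nodes to the subregions with the largest remainders.
        Integer[] order = new Integer[m];
        for (int i = 0; i < m; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(remainder[b], remainder[a]));
        for (int k = 0; k < totalNodes - assigned; k++) count[order[k]]++;
        return count;
    }

    public static void main(String[] args) {
        // Example: 800 nodes over the 3x3 subregions of the p = 0.4 configuration
        // (Table 2), all subregions having equal area.
        double[] p = {0.3, 0.8, 0.9, 0.6, 0.2, 0.2, 0.2, 0.2, 0.2};
        double[] area = new double[9];
        Arrays.fill(area, 1.0);
        System.out.println(Arrays.toString(allocate(800, p, area)));
    }
}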
Fig. 7 Network lifetime
Table 2 Simulation configuration

                CPCA and DPCP                      GS
Deployment      Proportional deployment            Density control
Parameters      f = 50%, λ = 1, α = 1              k = 1

                p = 0.4         p = 0.8
p0              0.3             0.3
p1              0.8             0.8
p2              0.9             0.9
p3              0.6             0.6
p4              0.2             0.92
p5              0.2             0.92
p6              0.2             0.92
p7              0.2             0.92
p8              0.2             0.92
Std Dev         0.287           0.216
In Fig. 7(a), 800 nodes are deployed. GS cannot cover the whole area even with all nodes once the subregion size exceeds four times Rs. With p-percent coverage, however, CPCA and DPCP can still provide the necessary surveillance even when the size of each subregion is 10 times the sensing range. This result shows once again that, for applications without strict surveillance requirements, p-percent coverage is indeed a good solution. When SS/Rs = 2 (SS is the Subregion Size), the network lifetimes of CPCA-0.4 (p = 0.4), DPCP-0.4, CPCA-0.8 (p = 0.8) and DPCP-0.8 are 7.6, 6.11, 4.12 and 2.75 times longer than that of GS-1, respectively. Additionally, CPCA and DPCP provide double the network lifetime when p drops from 0.8 to 0.4, which shows that the performance of our algorithms changes linearly with p. In Fig. 7(b), 1600 nodes are deployed, and the result shows that this almost always provides double the lifetime obtained with 800 nodes.

5.1.2 How does the size of the subregion affect the algorithms' performance?

Consider CPCA-0.4 as an example: for SS/Rs = 2, 3, and 4, network lifetime drops sharply. The reason is that SS/Rs only captures the linear relationship between the subregion's side length and the sensing range, whereas the subregion's area grows with the square of the side length. As shown in Fig. 7(a), the area increases by 125% when SS/Rs changes from 2 to 3, and the network lifetime of CPCA-0.4 drops from 130 to 95, which is 73% of the original value. Theoretically, network lifetime should drop to 57.8, which is 44% of 130. However, when SS/Rs = 2, the sensing range is relatively large compared to the subregion size, so the overlap between nodes in different subregions is considerable; network lifetime is therefore already shorter than the theoretical value, and the loss due to the increase in area is not large. Nevertheless, when SS/Rs changes from 3 to 4, network lifetime drops substantially: each subregion gets 77.8% more area, and the theoretical loss in network lifetime is 43.8%, which is close to our result of 50.5%.
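The arithmetic behind these figures can be written out explicitly, using only the assumption already made above, namely that for a fixed number of nodes the achievable lifetime scales inversely with the area to be covered:

\[
\frac{A_{SS/R_s=3}}{A_{SS/R_s=2}} = \left(\tfrac{3}{2}\right)^2 = 2.25 \;\;(\text{area} + 125\%),
\qquad
L_{\text{theory}} = 130 \times \left(\tfrac{2}{3}\right)^2 \approx 57.8 \approx 44\% \times 130
\]
\[
\frac{A_{SS/R_s=4}}{A_{SS/R_s=3}} = \left(\tfrac{4}{3}\right)^2 \approx 1.778 \;\;(\text{area} + 77.8\%),
\qquad
1 - \left(\tfrac{3}{4}\right)^2 = 43.75\% \;\;(\text{theoretical lifetime loss})
\]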
Fig. 8 Coverage accuracy
5.2 Accuracy

The goal of CPCA and DPCP is to cover p-percent of the target area. Intuitively, a higher actual p means wasted energy, and a lower actual p means the coverage requirement is not satisfied. Therefore, the actual coverage percentage should be as close as possible to p while, according to the coverage requirements, not falling below it. The simulation configuration is the same as in Sect. 5.1, except that f = 100% and we only compare CPCA and DPCP against the required coverage percentage. If f = 50%, the global coverage percentage might drop below 50% before the network dies; according to Definition 3.1 this is acceptable, but f = 100% shows the results more intuitively than f = 50% does, so we set f to 100% instead of 50%. The correctness of the simulation is not affected. Two sets of simulations, with 800 and 1600 nodes respectively, are executed; the results are shown in Fig. 8.

In Fig. 8(a), when SS/Rs = 2 or 3, the actual p is far from the indicated value. The reason is that when the subregion's size is not much larger than the sensing range, one extra node added into C can cause a large change in the coverage percentage: the covered area cannot be finely controlled when the subregion size is small relative to Rs. As the subregion size increases, the actual p drops quickly toward the indicated value, and when SS/Rs is greater than 5, the actual p is very close to the indicated p. DPCP's results are higher than CPCA's. The first reason is that, to reduce communication cost, the area in each subregion covered by nodes in neighboring subregions is ignored; hence, when a ROOT node finds that the indicated p has been reached, the actual p is in fact higher. The second reason is that, to connect with other subregions, redundant nodes have to be added into C. In fact, DPCP also provides a more robust connection than CPCA does, because the former tries to build connections with all neighboring subregions. However, when the indicated p is relatively large, the nodes added for connectivity also contribute to coverage, so the over-covered area becomes trivial.
Table 3 Balance simulation configuration

           Set 1    Set 2    Set 3    Set 4    Set 5
p0         0.4      0.8      0.1      0.1      0.6
p1         0.4      0.8      0.2      0.1      0.6
p2         0.4      0.8      0.3      0.1      0.9
p3         0.4      0.8      0.4      0.1      0.6
p4         0.4      0.8      0.5      1        0.6
p5         0.4      0.8      0.6      1        0.6
p6         0.4      0.8      0.7      1        0.9
p7         0.4      0.8      0.8      1        0.6
p8         0.4      0.8      0.9      1        0.6
Std Dev    0        0        0.274    0.474    0.132
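For reference, the Std Dev rows in Tables 2 and 3 appear to be the sample standard deviation (denominator n − 1 = 8) of the nine pj values; for example, for Set 4:

\[
\bar{p} = \frac{4 \times 0.1 + 5 \times 1}{9} = 0.6,
\qquad
s = \sqrt{\frac{4\,(0.1-0.6)^2 + 5\,(1-0.6)^2}{8}} = \sqrt{\frac{1.8}{8}} \approx 0.474
\]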
5.3 Balance

There are always some residual nodes which still have extra energy when the network dies. We define the balance as the standard deviation of the percentage of residual nodes in each subregion. Obviously, the number of residual nodes should be as small as possible so that less energy is wasted. The simulation configurations are shown in Table 3: 800 nodes are proportionally deployed, the subregion size is 2, 4, 6, and 8 times Rs, respectively, f = 50%, and the other parameters are the same as in the previous simulations. Five sets of simulations were executed. Sets 1 and 2 show the balance for low-p cases and high-p cases.
Fig. 9 Balance
Set 3 shows the balance for networks whose p changes smoothly across subregions. Set 4 represents cases with extremely different coverage requirements for the subregions. Set 5 represents cases in which most subregions have identical coverage requirements and only a few of them need special attention. The results are shown in Fig. 9.

In Fig. 9(a) and 9(b), Set 4 induces the worst cases for both CPCA and DPCP, which shows that larger differences among the subregions' coverage requirements result in more residual nodes. Sets 1 and 2 show that with similar coverage requirements for all subregions, nodes are used fairly and fewer nodes remain when the network dies. For Sets 3 and 4, which have a higher standard deviation of p, the standard deviations of the residual-node percentages are much higher than those of the previous two scenarios. The results of Set 5 show that a few subregions with remarkably different p do not affect the number of residual nodes.
Another observation is that the number of residual nodes decreases as the size of the subregions increases. Residual nodes exist because, in the subregions with many residual nodes, the nodes in the neighboring subregions keep providing enough coverage, so it is not necessary to use their own nodes. As the node density decreases, each subregion has to make full use of its own nodes and no subregion can benefit from its neighbors; therefore, the number of residual nodes decreases, as shown in Fig. 9(c) and 9(d). DPCP constructs more robust connectivity with neighboring subregions than CPCA does; hence, in each subregion, DPCP makes use of more nodes to build communication paths, and cases in which the nodes in a subregion are never or seldom used do not occur with DPCP. In Fig. 9(b), the lower standard deviation of the residual-node percentage shows that DPCP makes use of the nodes in all subregions more fairly.
Figure 9(c) and 9(d) show the average percentages of residual nodes. Sets 1, 2, and 5 still show better performance than do Sets 3 and 4. As shown in Fig. 9, DPCP achieves lower average percentages of residual nodes than CPCA. In summary, both DPCP and CPCA can prolong network lifetime effectively compared to GS; CPCA maintains better coverage accuracy than DPCP, and DPCP has better energy balance than CPCA.
6 Conclusion and future work

In this paper we studied the EPPC problem, which prolongs network lifetime by exploiting the fact that for some applications full coverage is not necessary and by reasonably decreasing the coverage percentages for different parts of the surveillance area. We proposed a centralized algorithm, CPCA, for solving the SSPC problem by selecting the least number of nodes needed to monitor p-percent of the target area, and a distributed algorithm, DPCP, which determines, in a distributed manner, a set of nodes that covers p-percent of the target area while guaranteeing network connectivity. The simulations show that our algorithms can remarkably prolong network lifetime, have good accuracy, and make use of nodes fairly in most common cases.
Acknowledgement This work is supported by the NSF under grants No. CCF-0545667 and CCF-0844829.

References
1. Cardei, M., Thai, M., Li, Y., Wu, W.: Energy-efficient target coverage in wireless sensor networks. In: INFOCOM 2005, 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Proc. IEEE, vol. 3, pp. 1976–1984 (2005)
2. Chen, J., Yu, F.: A location independent and coverage efficient protocol for wireless sensor networks. In: Integration Technology, 2007, ICIT '07, IEEE International Conference, pp. 751–755, March 2007
3. Zhou, Z., Das, S., Gupta, H.: Connected k-coverage problem in sensor networks. In: Computer Communications and Networks, 2004, ICCCN 2004, Proceedings, 13th International Conference, pp. 373–378, October 2004
4. Berman, P., Calinescu, G., Shah, C., Zelikovsky, A.: Efficient energy management in sensor networks. In: Ad Hoc and Sensor Networks. Wireless Networks and Mobile Computing, vol. 2. Nova Science, New York (2005)
5. Gao, S., Vu, C., Li, Y.: Sensor scheduling for k-coverage in wireless sensor networks. In: 2nd International Conference on Mobile Ad-hoc and Sensor Networks (2006)
6. Huang, C.-F., Tseng, Y.-C.: The coverage problem in a wireless sensor network. In: WSNA'03: Proceedings of the 2nd ACM International Conference on Wireless Sensor Networks and Applications, pp. 115–121. ACM, New York (2003)
7. Wan, P.-J., Yi, C.-W.: Coverage by randomly deployed wireless sensor networks. IEEE/ACM Trans. Netw. 14(SI), 2658–2669 (2006)
8. Cheng, M., Ruan, L., Wu, W.: Achieving minimum coverage breach under bandwidth constraints in wireless sensor networks. In: INFOCOM 2005, 24th Annual Joint Conference of the IEEE Computer and Communications Societies, Proc. IEEE, vol. 4, pp. 2638–2645 (2005)
9. Wang, B., Chua, K.C., Srinivasan, V., Wang, W.: Sensor density for complete information coverage in wireless sensor networks. In: Proceedings of EWSN 2006, February 2006
10. Cardei, M., Wu, J.: Energy-efficient coverage problems in wireless ad-hoc sensor networks. Comput. Commun. J. 29(4), 413–420 (2006)
11. Xu, K., Hassanein, H., Takahara, G., Wang, Q.: WSN04-2: Differential random deployment for sensing coverage in wireless sensor networks. In: Global Telecommunications Conference, 2006, GLOBECOM'06, pp. 1–5. IEEE, New York (2006)
12. Berman, P., Calinescu, G., Shah, C., Zelikovsky, A.: Efficient energy management in sensor networks. In: Xiao, Y.P.Y. (ed.) Ad Hoc and Sensor Networks. Wireless Networks and Mobile Computing, vol. 2. Nova Science, New York (2005)
13. Tan, H.: Maximizing network lifetime in energy-constrained wireless sensor network. In: IWCMC'06: Proceedings of the 2006 International Conference on Wireless Communications and Mobile Computing, pp. 1091–1096. ACM, New York (2006)
14. Wu, Y., Ai, C., Gao, S., Li, Y.: p-percent coverage in wireless sensor networks. In: International Conference on Wireless Algorithms, Systems and Applications, WASA'08, October 2008
15. Polastre, J., Szewczyk, R., Culler, D.: Telos: enabling ultra-low power wireless research. In: IPSN'05: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks, p. 48. IEEE Press, Piscataway (2005)
16. Crossbow, Mica2 datasheet, http://www.xbow.com/Products/productdetails.aspx?sid=174 (accessed 21 December 2008)
Yingshu Li received her Ph.D. and M.S. degrees from the Department of Computer Science and Engineering at the University of Minnesota-Twin Cities, in 2005 and 2003, respectively. She received her B.S. degree from the Department of Computer Science and Engineering at Beijing Institute of Technology, China, in 2001. Dr. Li is currently an Assistant Professor in the Department of Computer Science at Georgia State University. Her research interests include Optimization in Networks, Wireless Sensor Networks, Wireless Networking and Mobile Computing, Approximation Algorithm Design and Computational Biology. Her research has been supported by the National Science Foundation (NSF) of the US, the National Science Foundation of China (NSFC), the Electronics and Telecommunications Research Institute (ETRI) of Korea, and GSU internal grants.
Chunyu Ai is a Ph.D. student in the Department of Computer Science at Georgia State University, Atlanta. She received the B.S. degree in computer science from Heilongjiang University, China, in 2001, and the M.S. degree in computer science from Heilongjiang University, China, in 2004. Her current research interests include wireless sensor networks and data streams. She is a student member of the IEEE.
Zhipeng Cai is a Research Scientist in the Department of Computing Science at the University of Alberta. He received his Ph.D. and M.S. degrees from the Department of Computing Science at the University of Alberta, and his B.S. degree from the Department of Computer Science and Engineering at Beijing Institute of Technology. His research areas focus on virus (such as AIV, HIV-1, HCV, and FMDV) subtype and recombination prediction, whole genome based phylogenetic analysis, cancer bioinformatics through microarray, bovine genomics (radiation hybrid maps, linkage maps, haplotyping, quantitative trait loci association), tag SNP selection, and identification of linked regions through genotype data. His research interests also include computational biology, approximation and randomized algorithms design.

Raheem Beyah is an Assistant Professor in the Department of Computer Science at Georgia State University, where he leads the Georgia State Communications Assurance and Performance Group (CAP). He is also an Adjunct Professor in the School of Electrical and Computer Engineering at the Georgia Institute of Technology. He received his Bachelor of Science in Electrical Engineering from North Carolina A&T State University in 1998. He received his Masters and Ph.D. in Electrical and Computer Engineering from the Georgia Institute of Technology in 1999 and 2003, respectively. Prior to joining Georgia State, Dr. Beyah was a research faculty member with the Georgia Institute of Technology's Communications Systems Center (CSC) for four years and remains a part of the Center. He also worked as a consultant in Andersen Consulting's (now Accenture) Network Solutions group. He is an Associate Editor of the Wiley Security and Communication Networks Journal and serves as Guest Editor for the International Journal of High Performance Computing and Networking. His research interests include network security, wireless networks, network traffic characterization and performance, and security visualization. He is a member of IEEE, ACM, and NSBE.