Int. J. Sensor Networks, Vol. 11, No. 4, 2012
A least-movement topology repair algorithm for partitioned wireless sensor-actor networks

Ameer Ahmed Abbasi*
Department of Computer Engineering, King Fahd University of Petroleum & Minerals, Dhahran-31261, Saudi Arabia
Email: [email protected]
*Corresponding author

Mohamed F. Younis
Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, Maryland, USA
Email: [email protected]

Uthman A. Baroudi
Department of Computer Engineering, King Fahd University of Petroleum & Minerals, Dhahran-31261, Saudi Arabia
Email: [email protected]

Abstract: In Wireless Sensor-Actor Networks (WSANs), sensors probe their surroundings and send their data to more capable actor nodes. The actors’ response requires them to coordinate their operation. Therefore, a strongly connected inter-actor topology is necessary and tolerance of an actor failure becomes a design requirement. Autonomous repositioning of actors has been deemed an effective recovery strategy. In this paper, we present a distributed network recovery scheme called Least-Movement Topology Repair (LeMoToR). To restore connectivity, LeMoToR relies on the local view of a node about the network and strives to relocate the least number of nodes. It also reduces the total travelled distance and overall inter-node communication complexity. LeMoToR imposes no pre-failure communication overhead and utilises existing path discovery activities in the network to learn the structure of the topology. The performance of LeMoToR is validated analytically and through simulation. The validation results demonstrate the effectiveness of LeMoToR.

Keywords: wireless sensor-actor networks; fault tolerance systems; topology management; network connectivity restoration.

Reference to this paper should be made as follows: Abbasi, A.A., Younis, M. and Baroudi, U.A. (2012) ‘A least-movement topology repair algorithm for partitioned wireless sensor-actor networks’, Int. J. Sensor Networks, Vol. 11, No. 4, pp.250–262.

Biographical notes: Ameer Ahmed Abbasi is a PhD student with College of Computer Science and Engineering, King Fahd University of Petroleum and Minerals (KFUPM), Saudi Arabia. He received his BA and MA in Computer Technology from Sheikh Zayed Islamic Centre, University of Karachi, Pakistan, in 1999 and 2000, respectively. In 2011, he received his MS in Computer Engineering from KFUPM. He has several years of teaching and research experience.
His research interests include fault tolerance and topology management for mobile, ad-hoc and wireless sensor networks. He has published several technical papers in refereed conferences and journals. Mohamed F. Younis is currently an Associate Professor in the CSEE Department at UMBC. Before joining UMBC, he was with Honeywell International Inc., where he led multiple projects for building integrated fault tolerant avionics and dependable computing infrastructure. His technical interests include network architectures and protocols, embedded systems, fault tolerant computing, and secure communication. He has published over 160 technical papers in refereed conferences and journals. He has five patents to his credit. He serves/served on the editorial board of many journals and the organising and technical programme committees of numerous conferences. He is a senior member of the IEEE.
Copyright © 2012 Inderscience Enterprises Ltd.
Uthman A. Baroudi is currently an Assistant Professor in the Department of Computer Engineering at KFUPM. Before joining KFUPM, he worked for Nortel Networks, Ottawa, Canada, in R&D for next generation wireless networks. He has extensive teaching and research experience. He taught and developed several graduate and undergraduate COE courses in the areas of wireless and computer networking and computer systems performance evaluation. His research interest includes radio resource management (RRM) and QoS provisioning for the next generation wireless networks, wireless sensor and actuator networks. He has over 40 publications in reputable international journals and conference proceedings, and 1 US patent.
1 Introduction
Recent years have witnessed a growing interest in the applications of Wireless Sensor-Actor Networks (WSANs), especially those serving in remote and harsh environments in which human intervention is risky or impractical. Space exploration, battlefield reconnaissance and coastal and border patrol are examples of the many WSAN applications. In a WSAN, a large number of sensors are deployed in an area of interest to probe their surroundings. The sensors are miniaturised, battery-operated devices that report their measurements to more powerful actor nodes. The actors process the received sensor data and collaborate on executing tasks in response. For example, actors may spray chemicals in order to extinguish a fire detected by the sensors. Robots and unmanned vehicles are examples of actor nodes in practice (Akyildiz and Kasimoglu, 2004). To enable such collaborative operation, actors need to be able to reach each other at all times. Typically, an actor's motion during the execution of a task is predictable, and any changes in the inter-actor topology are known in advance and mitigated in order to ensure strong inter-actor connectivity. However, a sudden failure of an actor may cause the network to split into multiple disjoint blocks and would thus risk a major disruption of the network operation. Given the unattended operation of WSANs, the actors have to deal with the failure autonomously. In other words, the WSAN ought to self-heal without reliance on external help, e.g. deploying a replacement for the dead actor (Younis and Akkaya, 2008). On the other hand, pursuing a distributed recovery is very challenging since the nodes in different blocks can no longer communicate. Therefore, contemporary schemes found in the literature require every node to maintain partial knowledge of the network state.
To avoid excessive state-update overhead and to expedite the connectivity restoration process, prior work relies on maintaining one-hop or two-hop neighbour lists and on predetermined criteria for a node's involvement in the recovery (Younis et al., 2008; Abbasi et al., 2009; Akkaya et al., 2010). We argue that exchanging messages in order to maintain two-hop neighbour information imposes substantial communication overhead, especially for such a dynamic network. In addition, one-hop based recovery schemes are not efficient since they often involve many actors and require long travel distances. In some mission-critical applications node movement is undesirable, and moving many actor nodes as a side effect of the recovery process could cause the application mission to fail. For example, moving a number of actor nodes away while they are busy extinguishing a fire or providing life support to natural-disaster victims could lead to a disaster.

Unlike prior work, this paper utilises existing path discovery activities to obtain and maintain topology-related information and imposes no additional pre-failure communication overhead. It is worth mentioning that the routing cost is not counted towards the communication overhead of the proposed algorithm since data have to be routed anyway, regardless of whether the proposed algorithm is applied. A novel Least-Movement Topology Repair (LeMoToR) algorithm is proposed. LeMoToR relies on the local view of a node about the network to orchestrate an autonomous restoration of strong connectivity. LeMoToR strives to relocate the least number of nodes and to reduce the total travel distance and communication overhead. LeMoToR opts to localise the recovery process and operates in a distributed manner. When a node fails, its neighbours individually consult their possibly incomplete routing tables to decide on the appropriate course of action and to define their role in the recovery, if any. If the failed node is a cut-vertex, i.e. a node whose failure causes the network to partition into disjoint blocks, the neighbour that belongs to the smallest block reacts. LeMoToR is applied recursively to sustain connectivity within the smallest block: when a node moves, its neighbours repeat the LeMoToR connectivity restoration process. LeMoToR is validated analytically and through simulation and is shown to outperform existing schemes in terms of communication overhead and the number of relocated nodes.

The next section states the assumptions and discusses the problem. Section 3 compares LeMoToR to related work. Section 4 describes LeMoToR in detail and Section 5 presents the simulation results. The paper is concluded in Section 6.
2 System model and problem statement
A WSAN employs a large number of sensor nodes and far fewer actor nodes. Sensor nodes are low-cost devices that are constrained in energy and processing capacity. Sensors probe the environment in their vicinity and report their findings to actor nodes through radio transmissions. Actor nodes, on the other hand, are more expensive and are capable of performing complex computation. Although the communication range r of actor nodes is much larger than that of sensor nodes, it is insufficient for covering the entire deployment area. Figure 1 shows an articulation of the assumed WSAN model.
Since WSAN applications are collaborative in nature, inter-actor communication is required all the time. Upon receiving data from sensors, an actor processes these data and consults other actors on the best course of action. In theory, a satellite communication channel seems attractive for providing such inter-actor communication. Nonetheless, satellite channels are intermittent and expensive, which makes them unsuitable for the frequent inter-actor communication required by WSAN applications. Thus, contemporary terrestrial radio links constitute the backbone of the inter-actor network. At network bootstrapping, actors discover each other and form a one-connected inter-actor topology (Akkaya and Younis, 2006). Ranging technologies and localisation techniques are used by actors in order to establish a relative coordinate system and become aware of the position of each other (Youssef et al., 2005). Given the application-based interaction, an actor is assumed to know how many actors there are in the network. We assume that actor nodes can move on demand, i.e. to enhance coverage, to perform some actuation task or to restore inter-actor network connectivity. Since the focus of this work is to restore one-connectivity at the inter-actor network level, sensor nodes are not considered in the recovery process. Therefore, in the rest of the paper, the terms ‘actor’ and ‘node’ are used interchangeably. It is worth noting that sensors can reach at least one of the actor nodes directly or over a multi-hop path and that movement of actor nodes would not affect the sensor-actor connectivity. The impact of an actor's failure on the network topology varies based on its position. The loss of a leaf node does not affect the paths between other actors. Meanwhile, the failure of a cut-vertex can partition the inter-actor topology into disjoint blocks.
A node (vertex) in a graph is a cut-vertex if its removal, along with all its edges, produces a graph with more connected components (blocks) than the original graph. For example, in Figure 2 the network stays strongly connected after the loss of a leaf actor such as A21 or a non-leaf node such as A5. Meanwhile, the failure of the cut-vertex A0 leaves nodes A4, A5 and A6 isolated from the rest of the network. Precautionary design measures can be employed in order to avoid the effect of a failed cut-vertex. These precautionary approaches often deploy redundant nodes and establish multiple node-independent paths between each pair of actors, i.e. form a k-connected inter-actor topology. However, establishing and maintaining k-connectivity is not only expensive in terms of resource usage but also imposes many constraints and complicates the management of a WSAN. The more practical approach for dealing with cut-vertices is to pursue real-time connectivity restoration, which is an asynchronous process that engages the nodes only when a failure takes place. LeMoToR fits in the real-time connectivity restoration category.

In this work we assume non-simultaneous node failures, which is, however, not a limitation of LeMoToR. Since the probability of multiple failures at the same time is very small, most recovery schemes in the literature assume a single failure at a time. Suppose p is the probability of a single failure. Assuming failures are independent, the probability of two simultaneous failures is $p^2$, of three is $p^3$, and of n is $p^n$. Since p is a small fraction, the probability of multiple simultaneous faults diminishes rapidly. Furthermore, non-critical node failures can be handled at the network layer as the network stays connected, and their handling might also involve node relocation (Akkaya and Younis, 2006; Younis and Akkaya, 2008). However, we focus on the failure of critical nodes (i.e. cut-vertices), which causes the network to split into several sub-networks. Reconnecting such disjoint sub-networks is very challenging as communication among them is not possible. In the presentation of our work, we emphasise the algorithmic part of connectivity restoration rather than link-layer issues. In addition, to simplify the analysis, we assume that the communication range r is the same for all actor nodes in the WSAN, which, however, is not a requirement of our proposed algorithm.

Figure 1 An articulation of a WSAN with a connected inter-actor network (see online version for colours)
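The cut-vertex definition given in this section can be checked directly by removing a node and counting the connected components that remain. The following is a minimal Python sketch of that brute-force test. The six-node topology and the helper names (build_adj, count_components, is_cut_vertex) are hypothetical illustrations; they are not part of LeMoToR, which infers criticality from routing information rather than from a global view of the network.

```python
from collections import deque


def build_adj(edges):
    """Build an undirected adjacency map from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_components(adj, removed=None):
    """Count connected components, ignoring the node `removed` (if any)."""
    seen = set()
    components = 0
    for start in adj:
        if start == removed or start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v != removed and v not in seen:
                    seen.add(v)
                    queue.append(v)
    return components


def is_cut_vertex(adj, node):
    """A node is a cut-vertex if removing it increases the component count."""
    return count_components(adj, removed=node) > count_components(adj)


# Hypothetical topology (not from the paper): a triangle 0-1-2,
# a link 2-3, and two leaves 4 and 5 attached to node 3.
TOY_EDGES = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (3, 5)]
adj = build_adj(TOY_EDGES)
cut_vertices = sorted(n for n in adj if is_cut_vertex(adj, n))
print(cut_vertices)  # [2, 3]
```

In this toy graph, nodes 0 and 1 lie on a cycle and nodes 4 and 5 are leaves, so only nodes 2 and 3 are critical. The test costs one traversal per candidate node, which is acceptable for an illustration but is not how a distributed scheme would operate.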
Figure 2 An example connected inter-actor network. Nodes A0, A10, A14, and A19 are cut-vertices whose failure leaves the network partitioned into two or multiple disjoint blocks

3 Related work
Tolerance of occasional node failure in WSANs has received increased attention in recent years (Younis and Akkaya, 2008). Network connectivity and coverage have been deemed the most important properties in the context of WSANs. The pursued strategies can be generally classified into two main categories, namely reactive and provisioned. A reactive strategy may either require the deployment of additional nodes (Lee and Younis, 2010; Senel et al., 2011) or rely on the existing resources in the network and reposition some of the healthy nodes in order to restore the lost connectivity and/or coverage (Wang et al., 2005b; Bai et al., 2006; Tan et al., 2008; Abbasi et al., 2010; Tamboli and Younis, 2010). Provisioned tolerance of node failure ensures that the network has sufficient resources at the time of network set-up to sustain uninterrupted operation regardless of the loss of a subset of nodes (Basu and Redi, 2004; Hao et al., 2004; Zhang et al., 2004; Liu et al., 2006; Li et al., 2007; Lloyd and Xue, 2007; Zhang et al., 2007). Provisioned tolerance usually employs redundancy so that a failed node will not prevent the fulfilment of certain design goals. We argue that reactive recovery strategies are the most appropriate for WSANs given their dynamic nature and the cost and challenge involved in deploying redundant actor nodes. A number of schemes have been recently proposed for restoring the network connectivity in partitioned WSANs (Younis and Akkaya, 2008). The objective of most published approaches is to enable autonomous recovery while imposing the least overhead on the nodes (Younis et al., 2008; Abbasi et al., 2009; Akkaya et al., 2010). Contemporary metrics include the total travel distance, messaging overhead and the number of involved nodes. In addition to these metrics, LeDiR (Abbasi et al., 2010) constrains the recovery process with the requirement of sustaining the length of the shortest paths between actors. The rationale is that some applications are delay sensitive, so ignoring the effect of topology changes on path length would be unacceptable.
To do so, LeDiR strives to identify the smallest block in the partitioned network, moves the closest node in this block to the position of the failed actor and finally relocates the remaining nodes in the block in a cascaded manner as needed for sustaining the intra-block connectivity. The proposed LeMoToR approach is based on LeDiR. The main difference is that LeMoToR relaxes the path length requirements and applies the recovery process recursively for nodes that lose connectivity during the recovery. Published recovery schemes can also be categorised based on the required network state information. For example, DARA (Abbasi et al., 2009) and PADRA (Akkaya et al., 2010) rely on the node's awareness of its one-hop and two-hop neighbours in order to detect network partitioning and determine the scope of the recovery. Meanwhile, RIM (Younis et al., 2008) requires only one-hop information at the expense of more recovery overhead. LeDiR and LeMoToR utilise the route discovery activities to populate a Short-Route-Table (SRT) that is referenced for determining the impact of the failure and the recovery participants. As a result, the pre-failure overhead is minimal. The simulation results in Section 5 also demonstrate the effectiveness of the methodology even with a partially populated SRT. It is worth mentioning that some work, such as Basu and Redi (2004), assumes that the network is bi-connected and uses block movement to restore the lost two-connectivity when a node fails. LeMoToR assumes only one-connectivity, which is a more challenging problem.
Some prior work is concerned more with the loss of coverage than with the loss of connectivity when a node fails. For example, mitigating the coverage hole is the main aim of Wang et al. (2005a). Redundant sensors are determined by a grid quorum method where the network is divided into cells and the node density in the individual cells is checked. Redundant nodes are relocated to the area of the coverage hole. Meanwhile, Kasinathan and Younis (2011) focused on networks with heterogeneous node capabilities. In these set-ups the recovery is more constrained since the backup node has to possess a specific set of capabilities. A heuristic approach is proposed that engages one or multiple nodes in mitigating the coverage loss. Similar to provisioning k-connectivity, some work opts to ensure that important spots are covered by multiple sensors in order to mitigate the effect of node failure in critical surveillance applications (Ammari and Das, 2010). Finally, it is important to note that some published approaches have factored in both connectivity and coverage (Li et al., 2007; Tamboli and Younis, 2010). LeMoToR focuses on the connectivity restoration problem in dynamic networks, such as WSANs, in which the overhead of frequent and explicit update of the network state is not desired and coverage is cared for at the sensor rather than the actor level.
4 Least-movement topology repair
As mentioned earlier, the goal of LeMoToR is to reconnect the disjoint network while keeping node movement as small as possible and involving the least number of actor nodes in the recovery process. In this section, we first give an overview of LeMoToR as a centralised solution and then explain the distributed implementation.
4.1 Problem and solution analysis

Before explaining how LeMoToR works, it is important to understand the effect of network recovery on the overall node movement. Let us consider Figure 2 and assume that node A10 fails. Connectivity restoration schemes that exploit node repositioning and are close to LeMoToR in terms of complexity will recover the network by involving the neighbours of A10. For example, RIM (Younis et al., 2008) picks the one-hop neighbours and moves them to positions r/2 units away from the faulty node A10. Thus, A3, A9, A11 and A14 relocate to new positions that are r/2 units away from A10 and reconnect the network. However, the connectivity restoration process triggers further relocations of the neighbours (children) of each moved node. The resulting topology is shown in Figure 3. Nodes shown in grey have moved and have thus been involved in the recovery process. As mentioned in Section 1, this will not be acceptable for movement-sensitive applications. LeMoToR opts to avoid such a scenario by relocating the fewest possible nodes and confining the movement within the smallest portion of the network.
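The r/2 relocation rule attributed to RIM above reconnects the moved neighbours because any two points that are each within r/2 of the failed node's position are at most r apart, by the triangle inequality. A small Python sketch of the target computation follows. The coordinates, the range value and the function name rim_target are assumed for illustration only, and the rule that a neighbour already within r/2 stays put is an assumption rather than a detail taken from RIM.

```python
import math


def rim_target(neighbour, failed, r):
    """Return (target, travel): the point r/2 away from `failed` on the
    straight line towards `neighbour`, and the distance to be travelled."""
    nx, ny = neighbour
    fx, fy = failed
    d = math.hypot(nx - fx, ny - fy)
    if d <= r / 2:
        # Assumption: a neighbour already within r/2 does not move.
        return neighbour, 0.0
    scale = (r / 2) / d
    target = (fx + (nx - fx) * scale, fy + (ny - fy) * scale)
    return target, d - r / 2


# Hypothetical positions and communication range (arbitrary length units).
R = 10.0
FAILED = (0.0, 0.0)
NEIGHBOURS = {"A": (10.0, 0.0), "B": (0.0, 8.0)}

targets = {name: rim_target(pos, FAILED, R) for name, pos in NEIGHBOURS.items()}
(ta, _), (tb, _) = targets["A"], targets["B"]
gap = math.hypot(ta[0] - tb[0], ta[1] - tb[1])
print(targets)
print("relocated neighbours within range of each other:", gap <= R)
```

The sketch covers only the first round of movement. The cascaded relocation of the children of each moved node, which is what inflates the number of moved nodes in Figure 3, is not modelled here.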
Figure 3 Illustrating how RIM (Younis et al., 2008) restores connectivity after the failure of node A10 in the connected inter-actor topology of Figure 2. Nodes with grey colour are moved and get involved in the recovery process

The main idea of LeMoToR is to replace the faulty node by a neighbour that belongs to the smallest disjoint block. In case this node gets disconnected from its children, i.e. its neighbours within the block, LeMoToR is further applied recursively. This not only moves the least number of actor nodes but also limits the recovery overhead in terms of the distance that the nodes collectively travel. For the previous example, when A10 fails, LeMoToR will only involve the block of node A14. In addition, LeMoToR opts to avoid the effect of the relocation on coverage and also limits the travel distance by applying itself recursively and moving a node only when it becomes unreachable from its neighbours. To explain LeMoToR in detail, we first assume that every node is aware of the entire network topology prior to the failure and thus can build the SRT for every pair of nodes. This assumption is eliminated later in this section. The LeMoToR approach is distributed in nature and thus does not require complete topological information. In addition, no special pre-failure messaging is required to build the SRT. It can simply be populated through the route discovery activities in the network, e.g. when an on-demand routing protocol such as AODV is employed. The simulation results presented in Section 5 confirm that LeMoToR works well with partial topological information. Without loss of generality, we use the hop count as the path cost. The following highlights the major steps:

1 Failure detection: To detect a node failure in the neighbourhood, an exchange of heartbeat messages is assumed in the network. After n missing heartbeats, a node F is assumed faulty. If the failed node F is a critical node (i.e. a cut-vertex), network recovery measures are triggered on the one-hop neighbours of F. Upon detecting a node failure in the neighbourhood, the one-hop neighbours of F analyse the impact of the failure on network connectivity. Such analysis is needed to determine the criticality of a faulty node and can be done by using the SRT. Basically, a cut-vertex F has to be on the shortest path between at least two neighbours of F. Consider Table 1, which lists the entries of the SRT for the network topology in Figure 2. After the failure of actor A19, which is a cut-vertex, node A20 will check which nodes are reachable through A19, which are A8 and A9 in this example. Checking the entries for nodes A8 and A9 reveals that A1, A3, A7 and A10 will consequently become unreachable. The same is repeated and finally leads node A20 to conclude that only A21 is reachable and that A19 is indeed a critical node. The SRT analysis can reach the same conclusion for a node that is not a cut-vertex but serves on the shortest path of all nodes. For example, in a wheel-shaped topology, the node at the centre is not a cut-vertex, yet it serves on the shortest paths among many nodes on the outer ring. The SRT points out the criticality of such a node and motivates the invocation of the recovery process.
2 Smallest block identification: As mentioned earlier, after a cut-vertex failure the one-connected network G is split into more than one connected component, i.e. sub-network sub(G). Each sub(G) consists of a few nodes of G that are one-connected to each other within the sub(G). Basically, each sub(G) is a separate ‘block’ that was connected to the other blocks in G via the faulty cut-vertex. LeMoToR attempts to find the block among the disjoint blocks that consists of the least number of nodes, referred to hereafter as the ‘smallest block’. LeMoToR aims to confine the node movement within the smallest block in order to minimise the node movement. To identify the smallest block, every one-hop neighbour of the faulty node uses the SRT to identify the set of reachable nodes for itself and for every other one-hop neighbour of the failed node. The block with the fewest nodes is identified as the smallest block. For example, let us again consider the network topology provided in Figure 2 and assume that node A19 has failed. Missing heartbeat messages alert the one-hop neighbours of A19 (A8, A9 and A20) that a node failure has taken place in the neighbourhood. Nodes A8, A9 and A20 perform the SRT analysis, discussed above under failure detection, to confirm that node A19 is indeed a critical node (cut-vertex). It is important to note that the one-hop neighbours must perform such analysis each time a faulty node is detected in the neighbourhood in order to decide whether to proceed with recovery and to be able to identify the disjoint blocks. Coming back to our example, after the analysis, node A20 will conclude that it can reach only A21, and thus A20 and A21 constitute a block. Node A20 calculates its block size (i.e. the number of nodes in the sub-block), which is 2, and stores it. Next, A20 looks for the other disjoint block(s) and their sizes. It does so by checking the column of A19 in its SRT and finds that A8 and A9 are the other direct neighbours of A19. Node A20 then repeats the analysis for each direct neighbour of A19 to find out its block size. This enables node A20 to determine the smallest block after A19 fails. A20 will lead the recovery effort if it happens to belong to the smallest block, which is the case in this example. Nodes A8 and A9 perform the same analysis and conclude that they are not part of the smallest block.
3 Replacing the faulty node: To replace the faulty node F, a neighbour node J is selected from the smallest block. Since node J is normally a gateway node that connects the smallest block to the rest of the network via the critical (i.e. failed or moved) node, we refer to it as the ‘parent’. Any node in the smallest block is considered a ‘child’ if it is one hop away from the parent node. Since LeMoToR is recursive in nature, parent and child nodes are reconsidered in every round. In other words, in each subsequent iteration, a new smallest block within the current smallest block is identified; the selected best candidate is considered the parent, and any node that is one hop away from it is considered a child. The reason for selecting J from the smallest block is that LeMoToR strives to minimise the number of node movements during the network recovery. Since LeMoToR is recursive in nature, moving a node and its children from the smallest block would most probably involve the fewest actor nodes in the recovery. In case more than one actor with such characteristics exists, the actor closest to the faulty node is picked. Any further ties are resolved by selecting the actor with the least ID.
4 Children movement: When node J moves to replace the faulty node, some of its children may become disconnected. To regain connectivity, the children treat the moved parent node as a dead node and apply LeMoToR at the children level. The smallest block at the children level is identified, and the child that belongs to the smallest block proceeds to the former location of the moved parent node. This process continues until all the nodes are reconnected with the network G. Figure 4 shows an example of how LeMoToR restores connectivity after the failure of A10. Node A10 is a cut-vertex and A14 is the one-hop neighbour that belongs to the smallest block (Figure 4a–c). In Figure 4d, node A14 notifies its neighbours and moves to the position of A10 to restore connectivity. The disconnected children, nodes A15 and A16, execute LeMoToR again to find out which one of them should move to the location of A14. Node A15 belongs to the smallest block and thus moves to the location of A14 to maintain the communication link (Figure 4e). Note that the reason for executing LeMoToR recursively and identifying the smallest block even among the children is to minimise the overall node movement. Node A15 notifies its only child, A18, before it moves. Since A18 is the only child, it trivially belongs to the smallest block and moves to the location of A15 (Figure 4f). Figure 4f shows the repaired network.
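Steps 1 to 3 above can be sketched with a global view of the topology, which corresponds to the centralised explanation used in this subsection. The Python sketch below is illustrative only. Its edge list is read off Table 1 under the assumption that an entry P[v, w] = v marks a direct link between Av and Aw, so it is a derived reconstruction rather than data supplied by the authors. The function names are hypothetical, ties are broken by least ID only (node positions are not modelled), and the cascaded children movement of step 4 is not simulated because it depends on node coordinates.

```python
from collections import deque

# Assumed links, derived from Table 1 (entry P[v, w] == v read as a link v-w).
EDGES = [
    (0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6),
    (1, 3), (1, 7), (1, 8), (1, 9),
    (2, 3), (2, 12), (2, 13),
    (3, 9), (3, 10), (4, 5), (5, 6), (7, 8),
    (8, 9), (8, 19), (9, 10), (9, 19),
    (10, 11), (10, 14), (11, 12),
    (14, 15), (14, 16), (15, 18), (16, 17), (16, 18), (17, 18),
    (19, 20), (20, 21),
]


def build_adj(edges):
    """Build an undirected adjacency map from a list of (u, v) links."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def block_of(adj, start, failed):
    """Nodes reachable from `start` once `failed` is removed (BFS)."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v != failed and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def best_candidate(adj, failed):
    """Return (candidate, smallest_block), or (None, None) when the failed
    node is not a cut-vertex and no relocation is needed."""
    blocks = []
    for nb in sorted(adj[failed]):
        if not any(nb in block for block in blocks):
            blocks.append(block_of(adj, nb, failed))
    if len(blocks) < 2:
        return None, None
    smallest = min(blocks, key=len)
    # Simplification: least ID among the failed node's neighbours in the block.
    candidate = min(nb for nb in adj[failed] if nb in smallest)
    return candidate, smallest


adj = build_adj(EDGES)
for failed in (19, 10, 5):
    print(failed, best_candidate(adj, failed))
```

Under these assumptions the sketch selects A20 (block of size 2) when A19 fails and A14 when A10 fails, in line with the examples discussed in the text, and reports no candidate for the non-critical node A5.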
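Table 1 The path predecessor matrix generated by the Floyd–Warshall algorithm (Cormen et al., 1990) for the network topology of Figure 2. For each pair of nodes v and w, the path matrix entry P[v, w] contains a node k which is the direct predecessor of w on the shortest path to v

| v \ w | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | – | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 1 | 1 | – | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 1 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 2 | 2 | 0 | – | 2 | 0 | 0 | 0 | 1 | 1 | 3 | 3 | 12 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 3 | 3 | 3 | 3 | – | 0 | 0 | 0 | 1 | 1 | 3 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 4 | 4 | 0 | 0 | 0 | – | 4 | 0 | 1 | 1 | 1 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 5 | 5 | 0 | 0 | 0 | 5 | – | 5 | 1 | 1 | 1 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 6 | 6 | 0 | 0 | 0 | 0 | 6 | – | 1 | 1 | 1 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 7 | 1 | 7 | 0 | 1 | 0 | 0 | 0 | – | 7 | 1 | 3 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 8 | 1 | 8 | 0 | 1 | 0 | 0 | 0 | 8 | – | 8 | 9 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 8 | 19 | 20 |
| 9 | 1 | 9 | 3 | 9 | 0 | 0 | 0 | 1 | 9 | – | 9 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 9 | 19 | 20 |
| 10 | 3 | 3 | 3 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | – | 10 | 11 | 2 | 10 | 14 | 14 | 16 | 15 | 9 | 19 | 20 |
| 11 | 2 | 3 | 12 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | 11 | – | 11 | 2 | 10 | 14 | 14 | 16 | 15 | 9 | 19 | 20 |
| 12 | 2 | 0 | 12 | 2 | 0 | 0 | 0 | 1 | 1 | 10 | 11 | 12 | – | 2 | 10 | 14 | 14 | 16 | 15 | 9 | 19 | 20 |
| 13 | 2 | 0 | 13 | 2 | 0 | 0 | 0 | 1 | 1 | 3 | 3 | 12 | 2 | – | 10 | 14 | 14 | 16 | 15 | 9 | 19 | 20 |
| 14 | 3 | 3 | 3 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | 14 | 10 | 11 | 2 | – | 14 | 14 | 16 | 15 | 9 | 19 | 20 |
| 15 | 3 | 3 | 2 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | 14 | 10 | 11 | 2 | 15 | – | 14 | 16 | 15 | 9 | 19 | 20 |
| 16 | 3 | 3 | 2 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | 14 | 10 | 11 | 2 | 16 | 14 | – | 16 | 16 | 9 | 19 | 20 |
| 17 | 3 | 3 | 2 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | 14 | 10 | 11 | 2 | 16 | 18 | 17 | – | 17 | 9 | 19 | 20 |
| 18 | 3 | 3 | 2 | 10 | 0 | 0 | 0 | 1 | 9 | 10 | 14 | 10 | 11 | 2 | 16 | 18 | 18 | 18 | – | 9 | 19 | 20 |
| 19 | 1 | 8 | 3 | 9 | 0 | 0 | 0 | 8 | 19 | 19 | 9 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | – | 19 | 20 |
| 20 | 1 | 8 | 3 | 9 | 0 | 0 | 0 | 8 | 19 | 19 | 9 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 20 | – | 20 |
| 21 | 1 | 8 | 3 | 9 | 0 | 0 | 0 | 8 | 19 | 19 | 9 | 10 | 2 | 2 | 10 | 14 | 14 | 16 | 15 | 20 | 21 | – |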
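A predecessor matrix of the kind shown in Table 1 can be produced with the Floyd–Warshall algorithm using the hop count as path cost. The Python sketch below is a generic textbook formulation, not the authors' implementation. The five-node graph and the names floyd_warshall and shortest_path are hypothetical, and when several shortest paths exist the recorded predecessor depends on the iteration order, so a run on the topology of Figure 2 need not reproduce every entry of Table 1.

```python
INF = float("inf")


def floyd_warshall(n, edges):
    """Return (dist, pred) for an undirected graph with unit link costs.
    pred[v][w] is the direct predecessor of w on a shortest path from v."""
    dist = [[INF] * n for _ in range(n)]
    pred = [[None] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
        pred[u][v] = u
        pred[v][u] = v
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    pred[i][j] = pred[k][j]
    return dist, pred


def shortest_path(pred, v, w):
    """Rebuild the path v -> w by walking predecessors back from w.
    Returns None when no route from v to w has been recorded."""
    path = [w]
    while path[-1] != v:
        step = pred[v][path[-1]]
        if step is None:
            return None
        path.append(step)
    return path[::-1]


# Hypothetical five-node topology (not from the paper).
TOY_EDGES = [(0, 1), (1, 2), (2, 3), (1, 3), (3, 4)]
dist, pred = floyd_warshall(5, TOY_EDGES)
path = shortest_path(pred, 0, 4)
print(path)                   # [0, 1, 3, 4]
print(3 in path, 2 in path)   # node 3 is on the route, node 2 is not
```

Walking the predecessor chain in this way is what allows a neighbour of a failed node F to test whether its recorded route to some destination passes through F.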
4.2 Distributed LeMoToR implementation

The discussion above assumed that nodes are aware of the network topology and can assess the impact of the failure and uniquely identify which node should replace the failed actor. If every node in the network communicates with all other nodes, it is possible to fully populate the routing table and for the individual nodes to reach consistent decisions without centralised coordination. However, in many set-ups an actor may have only partial knowledge about the network, with routes to some nodes missing from its SRT. This can happen due to changes in the topology caused by node mobility, or because a subset of actors do not need to interact and a route has simply not been discovered yet. Let α be the percentage of entries, i.e. routes between actor pairs (i, j), that each node has acquired over time. Hereafter, we call α the Confidence Level (CL). For example, if 50% of the entries in node Ai's routing table are filled, we say that node Ai has a CL of 50%.

Figure 4 An example illustrating how LeMoToR restores connectivity after the failure of node A10 in the connected inter-actor topology of Figure 2, panels (a) to (f) (see online version for colours)
Figure 5 Pseudo code of LeMoToR
// Every node builds its shortest path routing table (SRT) based
// on the route discovery activities that it initiates or serves in, e.g.
// while executing a distributed routing protocol.
LeMoToR(J)
1 IF node J detects a failure of its neighbour F
2   IF neighbour F is a cut-vertex node
3     IF IsBestCandidate(J)
4       Notify_Children(J);
5       J moves to the Position of neighbour F;
6       Moved_Once ← TRUE;
7       Broadcast(Msg(‘RECOVERED’));
8       Exit;
9     END IF
10   END IF
11 ELSE IF J receives (a) notification message(s) from F
12   IF Moved_Once || Received Msg(‘RECOVERED’)
13     Exit;
14   END IF
15   LeMoToR(J)
16 END IF

IsBestCandidate(J)
// Check whether J is the best candidate for tolerating the failure
17 NeighbourList[] ← GetNeighbours(F) accessing the column F in SRT;
18 SmallestBlockSize ← Number of nodes in the network;
19 BestCandidate ← J;
20 FOR each node i in the NeighbourList[]
   // Use the SRT after excluding the failed node to find the
   // set of reachable nodes;
21   Number of reachable nodes ← 0;
22   FOR each node k in SRT excluding i and F
23     Retrieve shortest path from i to k by using SRT;
24     IF the retrieved shortest path does not include node F
25       No. of reachable nodes ← No. of reachable nodes + 1;
26     END IF
27   END FOR
28   IF Number of reachable nodes < SmallestBlockSize
29     SmallestBlockSize ← Number of reachable nodes;
30     BestCandidate ← i;
31   END IF
32 END FOR
33 IF BestCandidate == J
34   Return TRUE;
35 ELSE
36   Return FALSE;
37 END IF
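The IsBestCandidate() procedure of Figure 5 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation. It assumes that SRT entry (i, k) stores the node that precedes k on the shortest path from i to k, so a node i is a neighbour of F when its entry for F is i itself. The helper build_srt(), the dictionary layout and the toy topology are hypothetical and exist only to make the example self-contained. A missing SRT entry is treated here as "not reachable", which is also an assumption.

```python
from collections import deque


def build_srt(adjacency):
    """Hypothetical helper: predecessor table built by BFS from every node.

    srt[i][k] is the node that precedes k on a shortest path from i to k.
    """
    srt = {}
    for source in adjacency:
        pred, seen, queue = {}, {source}, deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    pred[v] = u
                    queue.append(v)
        srt[source] = pred
    return srt


def shortest_path(srt, i, k):
    """Rebuild the stored path i -> k, or None if an entry is missing."""
    path = [k]
    while path[-1] != i:
        previous = srt[i].get(path[-1])
        if previous is None or previous in path:
            return None  # unknown route (or inconsistent table)
        path.append(previous)
    return path[::-1]


def neighbours_of(srt, f):
    """Column F of the SRT: i is a neighbour when its entry for F is i."""
    return sorted(i for i in srt if i != f and srt[i].get(f) == i)


def reachable_count(srt, i, f):
    """Lines 21-27: count nodes whose stored shortest path from i avoids F."""
    count = 0
    for k in srt:
        if k in (i, f):
            continue
        path = shortest_path(srt, i, k)
        if path is not None and f not in path:
            count += 1
    return count


def is_best_candidate(srt, j, f):
    """Lines 17-37: True when J is the neighbour of F with the fewest reachable nodes."""
    smallest_block_size = len(srt)
    best_candidate = j
    for i in neighbours_of(srt, f):
        reachable = reachable_count(srt, i, f)
        if reachable < smallest_block_size:
            smallest_block_size = reachable
            best_candidate = i
    return best_candidate == j


# Toy topology: node 0 is a cut-vertex joining blocks {1}, {2, 3} and {4, 5, 6}.
adjacency = {0: [1, 2, 4], 1: [0], 2: [0, 3], 3: [2], 4: [0, 5], 5: [4, 6], 6: [5]}
srt = build_srt(adjacency)
print([n for n in neighbours_of(srt, 0) if is_best_candidate(srt, n, 0)])  # [1]
```

Because every neighbour scans the same neighbour list in the same order, neighbours holding identical tables reach the same decision without exchanging messages.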
Since every node may potentially have a different CL from others, upon the detection of a node failure the neighbouring nodes may have an inconsistent assessment of the impact of the node loss on the network and of which actor is the best candidate for leading the recovery. For example, in Figure 1 if node A11 was never on a route that has nodes A14, A15, A16, A17 and A18 as sources or destinations, node A11 will not know that A10 is a cut-vertex. We argue nonetheless that this is rare in practice since actor mobility is typically not high given their involvement in actuation activities. In addition, the operation in WSAN is collaborative in nature and an actor usually communicates with many others and thus the routing table would not be sparse.
The second issue is determining the best candidate, i.e. the neighbour of the failed node that belongs to the smallest block. For example, if node A14 does not have sufficient entries in its SRT, it would not know that it belongs to the smallest block and would thus not initiate the recovery process by moving to replace A10. Since the neighbours of A10 cannot reach each other, a partially populated SRT may lead to a deadlock with none of the neighbours of A10 responding to the failure, leaving the network disconnected. To handle this issue, LeMoToR imposes a time-out after which the neighbours belonging to the second largest block will move. This time multiple neighbours may be potentially
moving towards A10. To avoid having more than one actor replacing A10, LeMoToR requires these nodes to broadcast messages with their ID so that they pause as soon as they reach other neighbours of A10 that happen to be in a different block. The pause time would allow these neighbours to negotiate and pick the best candidate to continue on to the position of A10. We study the effect of the CL on the performance through simulation in Section 5.
Figure 5 shows the pseudo code of LeMoToR. A node J would trigger LeMoToR whenever a cut-vertex node failure is detected in the one-hop neighbourhood (lines 1–2). Node J would test its eligibility to move to replace the faulty node by executing the IsBestCandidate() procedure (line 3). Basically, the procedure IsBestCandidate() finds whether node J belongs to the smallest disjoint sub-network block. If so, node J notifies its children (lines 4–10) and moves to the location of the faulty node. Otherwise, node J checks whether it is to perform a movement to sustain current communication links (line 11), and if so it executes LeMoToR (line 15) to find whether it belongs to the smallest block. If so, it moves to the location of the already moved parent node to maintain the communication link. Nodes only move once (lines 12–14). LeMoToR would be executed on each child node that loses its direct communication link to the moved parent (neighbour).
4.3 Algorithm analysis
In this subsection, we present several theorems to analyse the performance and behaviour of the LeMoToR algorithm.
Theorem 1: The maximum number of nodes involved in the recovery process is O(N), where N is the number of actors in the WSAN.
Proof: Consider the worst case scenario where a one-dimensional network, as shown in Figure 6, is split into two equal blocks (sub-networks) and each block consists of (N − 1)/2 nodes. LeMoToR involves only one of the two blocks in the recovery process, simply by moving only one block towards the other. Assuming that the network is sparse and nodes are r units away from each other, where r is the node's communication range, every node in the block would move and participate in the recovery process. Thus, the maximum number of nodes involved in the recovery process would be $\lceil (N-1)/2 \rceil$, which is O(N).
Figure 6 The worst case scenario topology (a linear chain A1, A2, A3, A4, A5, A6, A7) where N = 7 and the failure of A4 has partitioned the network into two 3-node blocks. LeMoToR would involve at most three actors in the recovery process: either A3 or A5 is selected to replace the faulty node, followed by a series of inter-block node relocations (see online version for colours)
Theorem 2: LeMoToR strives to minimise the total travel distance and guarantees to terminate in O(N) iterations, where N is the number of actors in the WSAN.
Proof: LeMoToR selects the smallest block for recovery in order to minimise the total travel distance by moving the smallest number of nodes. Theorem 1 proves that in the worst case scenario O(N) nodes are involved in the recovery. During the entire recovery process a node can move only once, which means that LeMoToR guarantees to terminate in O(N) iterations.
Theorem 3: The message complexity of LeMoToR is O(N), where N is the number of actors in the WSAN.
Proof: LeMoToR depends on the route discovery protocol to maintain the routing table on each node, thus no special messaging is required to know the neighbours or network topology. If a node gets involved in the recovery process and decides to move, it broadcasts one message to its children to notify them about its movement. Another message is broadcast to connect with the neighbours once the node has reached the new position. In other words, every node participating in the recovery process would broadcast only two messages. In the worst case scenario only $\lceil (N-1)/2 \rceil$ nodes would participate in the recovery process, as proven in Theorem 1 above. Thus, the total number of messages sent is $2 \times \lceil (N-1)/2 \rceil$, which is O(N).
Theorem 4: The maximum distance a node travels in LeMoToR is r, where r is the actor radio range.
Proof: In the worst case (Figure 6), LeMoToR can select a one-hop neighbour to replace the faulty node that is at most r units away from it. When a node moves to replace the faulty node, possibly some of its children will lose direct links to it. This can disconnect the children from the rest of the network, which is undesirable. Therefore, the process is repeated recursively; i.e. the smallest block will be determined and one node will move and travel at most r units to restore network connectivity. Thus, the maximum distance a node travels during any stage in LeMoToR is r, as shown in Figure 7.
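The worst case used in Theorems 1, 3 and 4 can be checked numerically with a short Python sketch. It is an illustration under stated assumptions rather than the authors' simulator: nodes sit on a line exactly r units apart, the smaller block moves (the left block on a tie, which is an arbitrary choice made here), and each moving node is charged the two broadcasts described in the proof of Theorem 3. The function name simulate_chain_recovery is hypothetical.

```python
import math


def simulate_chain_recovery(n, r, failed):
    """Cascade relocation on a 1-D chain; `failed` is a 0-based node index.

    Returns (nodes moved, messages sent, longest single move, final positions).
    """
    positions = {i: i * r for i in range(n) if i != failed}
    left = [i for i in positions if i < failed]
    right = [i for i in positions if i > failed]

    # The smaller block shifts towards the gap; ties go to the left block.
    move_left_block = len(left) <= len(right)
    block = left if move_left_block else right
    step = r if move_left_block else -r

    moved = messages = 0
    max_distance = 0
    # The neighbour of the failed node moves first, then its children follow.
    for i in sorted(block, key=lambda node: abs(node - failed)):
        positions[i] += step
        moved += 1
        messages += 2  # one notification to children, one broadcast on arrival
        max_distance = max(max_distance, abs(step))
    return moved, messages, max_distance, sorted(positions.values())


# Figure 6 scenario: N = 7, A4 (index 3) fails, r = 100.
moved, messages, max_distance, final = simulate_chain_recovery(7, 100, 3)
print(moved, messages, max_distance)  # 3 6 100
```

For N = 7 the three moved nodes match $\lceil (N-1)/2 \rceil$, the six messages match $2 \times \lceil (N-1)/2 \rceil$, and no node travels further than r.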
Theorem 5: The maximum convergence time of the LeMoToR algorithm to restore inter-actor connectivity is O(N), where N is the number of actors in the WSAN.
Proof: As stated earlier, LeMoToR assumes that no other failure occurs during the recovery process. Let us assume that s is the maximum time required for a node to find whether it belongs to the smallest block, and that d is the number of neighbours of the failed node. The maximum time for a neighbour ‘A’ of the failed node ‘F’ to find out the block that it belongs to is $O(N \cdot d)$. Basically, a node will have to check the column for ‘F’ in the SRT to identify all the other ‘d − 1’ neighbours of ‘F’. Node ‘A’ then eliminates these ‘d − 1’ actors and all nodes that are reachable through them from its row in the SRT. This step is applied at most N − 1 times in a network of N actors, and node ‘A’ is a leaf node in the network. To determine whether its block is the smallest, node ‘A’ will repeat this process at most ‘d − 1’ times for the other neighbours of ‘F’. Thus, the maximum time s for a node to identify the smallest block is $O(N \cdot d^2)$.
Theorem 1 proves that the maximum number of nodes involved in the recovery process is O(N). In addition, Theorem 4 proves that the maximum distance a node travels in the recovery process is r. Suppose t is the time to travel distance r. Assuming the worst case scenario where the node movement is sequential, the total time to restore network connectivity would be O(N) × t. Thus, the maximum convergence time of LeMoToR to restore inter-actor connectivity is s + O(N) × t, which is $O(N[d^2 + t])$. For a uniform actor distribution, the value of d depends only on r (Savvides et al., 2001). Thus, both d and t can be considered constant and the inter-actor connectivity would be restored in O(N).
Figure 7
Assuming the worst case scenario presented in Figure 6, LeMoToR selects A3 to replace the faulty node A4 by travelling distance r. Once A3 has moved to the new position, A2 will move behind it to maintain direct connectivity. Later, A1 will do the same. Since the network is one-dimensional and nodes are located r units away from each other, the maximum distance travelled by a node is r (see online version for colours)
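The bound derived in the proof of Theorem 5 can be written out step by step. The symbol T for the total convergence time is introduced here only for illustration; s, d, t and N are as defined in the proof.

```latex
\begin{align*}
s &= O(N \cdot d) + (d - 1)\, O(N \cdot d) = O(N \cdot d^{2}), \\
T &= s + O(N) \times t = O(N \cdot d^{2}) + O(N \cdot t) = O\bigl(N\,[d^{2} + t]\bigr), \\
T &= O(N) \quad \text{when } d \text{ and } t \text{ are treated as constants.}
\end{align*}
```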
Theorem 6: For a WSAN with a homogeneous set of actors, LeMoToR restores connectivity with minimal loss in network coverage, with $C_{new} \ge C_o - C_a$, where $C_a$ is the nominal coverage of any actor node and $C_o$ is the pre-failure coverage of the network, i.e. $C_o = \bigcup_i C_i$, where $C_i$ is the coverage contributed by actor i.
Proof: Consider a homogeneous WSAN where all actors have similar capabilities. When a cut-vertex node ‘F’ fails, the network gets partitioned into m disjoint blocks. All partitions except $B_s$, from which a node ‘G’ will replace ‘F’, will preserve their initial coverage. Thus, $C_o$ can be expressed as:
$$C_o = C_0^{F} \cup C_0^{B_1} \cup C_0^{B_2} \cup \dots \cup C_0^{B_m}$$
After ‘F’ fails, $C_0^{F} = 0$, and when LeMoToR is applied, node ‘G’ fills in for ‘F’ and $C_{new}^{B_k} = C_0^{B_k}$ $\forall\, 1 \le k \le m$ and $k \ne s$. Hence,
$$C_{new} = C_0^{G} \cup C_0^{B_1} \cup \dots \cup C_{new}^{B_s} \cup \dots \cup C_0^{B_m}$$
Then, if this movement of ‘G’ causes new partitioning, i.e. connecting $B_k$ $\forall\, 1 \le k \le m$ and $k \ne s$ and splitting ($B_s$ − ‘G’) from the network, LeMoToR is applied again to replace ‘G’ with its closest neighbour in $B_s$. In the worst case (Figure 6), this repeats until the smallest block consists of just one node, which will move to reconnect with the rest of the network. Therefore, each time the network is partitioned, only one node from the smallest block will reposition at the location of the failed or departed node. For a homogeneous set of actors, the coverage of the individual nodes will be the same and the network coverage will be degraded in the spot covered by the last relocated node, which equals $C_0^{F}$. Thus, in this case:
$$C_{new}^{B_s} = C_0^{B_s} - C_0^{G}$$
and consequently, $C_{new} = C_o - C_a$.
If the intra-block topology of $B_s$ is dense, there may be some coverage overlap among neighbours and moving a node may not cause a loss of coverage that accounts for its entire range. In that case:
$$C_{new}^{B_s} \ge C_0^{B_s} - C_0^{G}$$
and hence, $C_{new} \ge C_o - C_a$.
Theorem 7: LeMoToR is a robust approach in the sense that a full routing table is not necessary for the proper functionality of the algorithm.
Proof: As LeMoToR relies on the SRT to find the smallest block resulting from the failure of node ‘F’, the question is whether an incomplete SRT will lead to a deadlock. In other words, a neighbour ‘A’ of ‘F’ may not have a complete SRT, assume that it is not part of the smallest block, and expect that another node ‘B’ will step forward to replace ‘F’. Meanwhile, the SRT of ‘B’ does not imply that node ‘B’ is the one to respond. This situation may lead to a deadlock that keeps the network partitioned. The time-out parameter of LeMoToR will prevent such a deadlock and guarantee convergence, since nodes ‘A’ and ‘B’ will eventually stop waiting and assume responsibility for the recovery. Although this situation may lead to a non-optimal decision in which multiple neighbours respond and one of them has to abandon its role after incurring some motion overhead, it makes LeMoToR a robust recovery scheme.
5 Performance analysis
The performance of LeMoToR is validated through simulation. In this section, we describe the simulation environment and present the results.
5.1 Simulation environment and performance metrics
The performance of LeMoToR has been validated using a WSAN simulator that we developed in Visual C++. The simulator has already been validated against extensive simulation experiments as well as existing approaches in the literature. In the simulation experiments, connected topologies have been created. Actors are placed in an area of 1000 × 600 m² using a random uniform distribution. The Shortest Path Routing Table (SRT) is formed using the Floyd–Warshall algorithm. This implies that every node is aware of the entire network topology. We then mimic the effect of CL by
randomly removing (1 − α)% of entries from the copy of the global SRT stored at the individual nodes in order to capture the performance of a distributed implementation. All cut-vertices in the topology are identified, one of them is randomly picked as the failed node, and LeMoToR is applied to restore connectivity. The following parameters are used to vary the characteristics of the WSAN topology in the experiments:
• Number of deployed actors (N): This parameter affects the node density and the WSAN connectivity. Increasing the value of N makes the WSAN topology highly connected. When studying the effect of network size, the number of actors has been varied from 20 to 100 while fixing the radio range (r = 100 m).
• Communication range (r): All actors have the same communication range r. The value of r affects the initial WSAN topology. While a small r creates a sparse topology, a large r boosts the overall network connectivity. The node count has been fixed at 100, while varying the communication range (25–200 m).
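The SRT construction and the CL masking described above can be sketched in Python as follows. This is a minimal illustration and not the Visual C++ simulator used in the experiments. It assumes unit (hop-count) link weights and stores, for each pair (i, j), the predecessor of j on the shortest path from i. The function names, the table layout and the reading of CL as a fraction of filled off-diagonal entries are assumptions made for this example.

```python
import random

INF = float("inf")


def floyd_warshall_srt(n, edges):
    """All-pairs shortest paths; pred[i][j] precedes j on the path from i."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    pred = [[None] * n for _ in range(n)]
    for u, v in edges:  # undirected links with unit weight
        dist[u][v] = dist[v][u] = 1
        pred[u][v] = u
        pred[v][u] = v
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
                    pred[i][j] = pred[k][j]
    return pred


def masked_copy(pred, cl, rng):
    """One node's copy of the global SRT, keeping a fraction `cl` of the entries."""
    n = len(pred)
    entries = [(i, j) for i in range(n) for j in range(n)
               if i != j and pred[i][j] is not None]
    keep = set(rng.sample(entries, round(cl * len(entries))))
    return [[pred[i][j] if (i, j) in keep else None for j in range(n)]
            for i in range(n)]


def confidence_level(table):
    """Percentage of off-diagonal entries that are filled."""
    n = len(table)
    filled = sum(table[i][j] is not None
                 for i in range(n) for j in range(n) if i != j)
    return 100.0 * filled / (n * (n - 1))


# Four actors in a chain 0-1-2-3; one node's copy is reduced to 50% CL.
full = floyd_warshall_srt(4, [(0, 1), (1, 2), (2, 3)])
partial = masked_copy(full, 0.5, random.Random(1))
print(confidence_level(full), confidence_level(partial))  # 100.0 50.0
```

Calling masked_copy() once per actor with a different CL value would mimic the heterogeneous case in which nodes hold differently populated tables.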
In order to evaluate the performance of LeMoToR, we quantify the overhead of the recovery process using the following three metrics:
• Total distance travelled: Reports the distance that the involved nodes collectively travel during the recovery. This can be envisioned as a network-wide assessment of the efficiency of the applied recovery scheme.
• Number of relocated nodes: Reports the number of nodes that moved during the recovery. This metric assesses the impact of the restoration algorithm on the ongoing activities by other actors as well as the scope of the connectivity restoration within the network.
• Number of exchanged messages: Tracks the total number of messages that have been exchanged among nodes. This metric captures the communication-related overhead.
For each simulation set-up 30 different network topologies are considered and the average values are reported. We observed that with 90% confidence level, the simulation results stay within 6%–10% of the sample mean.
5.2 Simulation results
As we mentioned earlier, LeMoToR strives to restore the network connectivity while minimising the number of relocated nodes. We compare the performance of LeMoToR to LeDiR (Abbasi et al., 2010) and RIM (Younis et al., 2008). The movement technique and operation of LeMoToR is closer to RIM than any other published scheme since it is designed particularly to restore the network connectivity with minimum messaging overhead. In addition, LeMoToR resembles LeDiR with the exception of the relaxation of the path-length constraint. Figure 8 shows the total travelled distance overhead under all considered approaches. It clearly indicates that LeMoToR and LeDiR have nearly similar performance and scale well as the network gets larger. However, in sparse topologies LeMoToR does not appear to have an advantage over RIM. While RIM has outperformed all other schemes for small network sizes, its performance degrades steadily as the network size grows. Considering the effect of communication range on the total travelled distance, Figure 8b shows that LeMoToR has a very stable behaviour and confirms the previous finding of minimum travelled distance. The efficiency of LeMoToR depends on the network traffic and activities since they directly affect how the SRT is populated. Nonetheless, LeMoToR still converges even if only a partial SRT is available.
Figure 8 The total distance travelled by actor nodes where (a) network size is varied (with r = 100), (b) communication range is varied (with N = 100) (see online version for colours)
As stated earlier, a decrease in the CL means fewer entries in the actor's SRT and less information for the actor to make the right assessment of the scope of the failure and define the most appropriate recovery plan. This leads to an increase in the likelihood of wrong decision making and results in more travel overhead. We noticed that this happens when the number of entries in the SRT is below 30% for all the nodes in the topology, which is very rare. However, Figure 8 shows that LeMoToR stays robust and yields results close to the optimal with random CL. In other words, despite the incomplete SRT that some nodes have, LeMoToR's performance matches the centralised implementation that bases the decision on knowing the entire network topology. While simulating LeMoToR, initially we assume that all the nodes are deployed together and thus have almost the same CL. In other words, all nodes are placed in the topology
with the same number of shortest path routing entries in their SRTs. Furthermore, we have tested the performance of LeMoToR with heterogeneous CL, meaning that some nodes are missing 30% of their SRT entries, some are missing 50% and some are missing 70%. This mimics the case when nodes are deployed in batches and the case when the traffic density is different throughout the network. In Figures 8a and 8b, the curves for LeMoToR and LeDiR with random CL reflect the performance with heterogeneous CL values and the results are very close to those of centralised implementations. Considering the number of relocated nodes during the recovery process, Figure 9 indicates clearly that LeMoToR outperforms all other approaches by moving fewer nodes during the recovery, especially for dense and highly connected topologies. Unlike RIM, LeMoToR and LeDiR try to relocate nodes that belong to the smallest block in order to avoid triggering large scale movement of child actors. Furthermore, LeMoToR extends the application of this mechanism to all child actors. This feature makes LeMoToR relocate the least number of actor nodes among contemporary approaches.
Figure 9 Number of actors that moved during the recovery while varying (a) the network size (with r = 100), (b) the communication range (with N = 100) (see online version for colours)
With respect to the number of messages, again LeMoToR does very well by introducing noticeably less messaging overhead, as shown in Tables 2 and 3. While RIM requires maintaining one-hop neighbour information, LeMoToR as well as LeDiR leverage the available route discovery process and do not impose pre-failure messaging overhead. The only communication cost incurred during the recovery is when a node informs its children about its movement or broadcasts the successful relocation. Nevertheless, LeMoToR requires fewer messages than LeDiR.

Table 2 Number of messages sent with varying number of actors
# of Actors   RIM    LeDiR                        LeMoToR
                     Central.   Dist. Rand. CL    Central.   Dist. Rand. CL
20            30     406        7                 402.8      4.6
40            56     1608       12                1604.2     6.2
60            85     3613       14                3605.75    5.2
80            115    6406       17                6401.1     5.8
100           152    10017      23                10010.5    6.1

Table 3 Number of messages sent with varying actor radio range
Radio Range   RIM    LeDiR                        LeMoToR
                     Central.   Dist. Rand. CL    Central.   Dist. Rand. CL
25            112    10010.9    12                10005.4    8
50            113    10009.4    16                10004.6    10.4
75            121    10010.7    17                10007.4    10.6
100           163    10017.3    22                10011.8    16.7
125           143    10013.4    26                10010.2    12.8
150           351    10016.2    33                10015      14.4
200           1072   10021.1    62                10018.9    50.5

6 Conclusion
WSANs can serve applications in harsh environments, in which actor nodes may be subject to damage. The collaborative and autonomous operation of the actors requires sustaining connectivity at all times and thus an actor failure must be tolerated in a distributed manner while imposing the least overhead. This paper has tackled this important problem and proposed a new distributed algorithm, LeMoToR. LeMoToR relies only on the local view of the network and does not impose pre-failure overhead. The performance of LeMoToR has been validated analytically and through extensive simulation experiments. We have also compared LeMoToR to a centralised version and to two other published schemes. The results have demonstrated that LeMoToR relocates the least number of actors to re-establish network connectivity after failure. LeMoToR also works very well in dense networks and matches the performance of the centralised implementation despite the partial knowledge that the nodes have about the network topology. This finding demonstrates the robustness of the algorithm and shows that a full routing table is not necessary for its proper functionality. Furthermore, this feature makes LeMoToR reliable and easy to implement.
Acknowledgements The authors would like to acknowledge the support of King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia, under the project # IN090006. The second author is also supported by the National Science Foundation, award # CNS 1018171.
References
Abbasi, A., Younis, M. and Akkaya, K. (2009) ‘Movement-assisted connectivity restoration in wireless sensor and actor networks’, IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 9, pp.1366–1379.
Abbasi, A., Younis, M. and Baroudi, U. (2010) ‘Restoring connectivity in wireless sensor-actor networks with minimal topology changes’, Proceedings of the IEEE International Conference on Communications (ICC 2010), 23–27 May, Cape Town, pp.1–5.
Akkaya, K., Senel, F., Thimmapuram, A. and Uludag, S. (2010) ‘Distributed recovery from network partitioning in movable sensor/actor networks via controlled mobility’, IEEE Transactions on Computers, Vol. 59, No. 2, pp.258–271.
Akkaya, K. and Younis, M. (2006) ‘COLA: a coverage and latency aware actor placement for wireless sensor and actor networks’, Proceedings of the IEEE Vehicular Technology Conference (VTC-F’06), 25–28 September, Montreal, Que, pp.1–5.
Akyildiz, I. and Kasimoglu, I. (2004) ‘Wireless sensor and actor networks: research challenges’, Ad Hoc Network Journal, Vol. 2, No. 4, pp.351–367.
Ammari, H. and Das, S. (2010) ‘A study of k-coverage and measures of connectivity in 3D wireless sensor networks’, IEEE Transactions on Computers, Vol. 59, No. 2, pp.243–257.
Bai, X., Kumar, S., Ding, X., Yun, Z. and Lai, T. (2006) ‘Deploying wireless sensors to achieve both coverage and connectivity’, Proceedings of the 7th ACM International Symposium on Mobile Ad Hoc Networking and Computing, pp.131–142.
Basu, P. and Redi, J. (2004) ‘Movement control algorithms for realization of fault-tolerant ad hoc robot networks’, IEEE Networks, Vol. 18, No. 4, pp.36–44.
Cormen, T., Leiserson, C. and Rivest, R. (1990) Introduction to Algorithms, 2nd ed., MIT Press, Cambridge, MA.
Hao, B., Tang, H. and Xue, G. (2004) ‘Fault-tolerant relay node placement in wireless sensor networks: formulation and approximation’, Proceedings of the Workshop on High Performance Switching and Routing (HPSR 2004), pp.246–250.
Kasinathan, K. and Younis, M. (2011) ‘Distributed approach for mitigating coverage loss in heterogeneous wireless sensor networks’, Proceedings of the 3rd IEEE International Workshop on Management of Emerging Networks and Services (MENS 2011), 5–9 December, Houston, Texas, USA.
Lee, S. and Younis, M. (2010) ‘Optimized relay placement to federate segments in wireless sensor networks’, IEEE Journal of Selected Areas in Communications, Vol. 28, No. 5, pp.742–752.
Li, D., Cao, J., Liu, M. and Zheng, Y. (2007) ‘K-connected target coverage problem in wireless sensor networks’, Proceedings of the 1st International Conference Combinatorial Optimization and Applications (COCOA 2007), pp.20–31.
Liu, H., Wan, P. and Jia, X. (2006) ‘On optimal placement of relay nodes for reliable connectivity in wireless sensor networks’, Journal of Combinatorial Optimization, Vol. 11, pp.249–260.
Lloyd, E. and Xue, G. (2007) ‘Relay node placement in wireless sensor networks’, IEEE Transactions on Computer, Vol. 56, No. 1, pp.134–138.
Savvides, A., Han, C. and Srivastava, M. (2001) ‘Dynamic fine-grained localization in ad-hoc networks of sensors’, Proceedings of the 7th Annual ACM International Conference on Mobile Computing and Networking (MOBICOM’01), 16–21 July, Rome, Italy, pp.166–179.
Senel, F., Younis, M. and Akkaya, K. (2011) ‘Bio-inspired relay node placement heuristics for repairing damaged wireless sensor networks’, IEEE Transactions on Vehicular Technology, Vol. 60, No. 4, pp.1835–1848.
Tamboli, N. and Younis, M. (2010) ‘Coverage-aware connectivity restoration in mobile sensor networks’, Journal of Network and Computer Applications, Vol. 33, pp.363–374.
Tan, G., Jarvis, S., Kermarrec, A. and Rennes, I. (2008) ‘Connectivity-guaranteed and obstacle-adaptive deployment schemes for mobile sensor networks’, Proceedings of the 28th International Conference on Distributed Computing Systems, 17–20 June, Beijing, pp.429–437.
Wang, G., Cao, G., Porta, T. and Zhang, Z. (2005a) ‘Sensor relocation in mobile sensor networks’, Proceedings of 24th IEEE International Conference on Computer Communications (INFOCOM 2005), 13–17 March, Miami, FL, USA, Vol. 4, pp.2302–2312.
Wang, Y., Hu, C. and Tseng, Y. (2005b) ‘Efficient deployment algorithms for ensuring coverage and connectivity of wireless sensor networks’, Proceedings of the 1st International Conference on Wireless Internet (WICON’05), 10–14 July, pp.114–121.
Younis, M. and Akkaya, K. (2008) ‘Strategies and techniques for node placement in wireless sensor networks: a survey’, The Journal of Ad-Hoc Networks, Vol. 6, No. 4, pp.621–655.
Younis, M., Lee, S., Gupta, S. and Fisher, K. (2008) ‘A localized self-healing algorithm for networks of moveable sensor nodes’, Proceedings of the IEEE Global Telecommunication Conference (Globecom’08), 30 November–4 December, New Orleans, LA, pp.1–5.
Youssef, A., Agrawala, A. and Younis, M. (2005) ‘Accurate anchor-free localization in wireless sensor networks’, Proceeding of 1st IEEE Workshop on Information Assurance in Wireless Sensor Net, pp.465–470.
Zhang, L., Wang, X. and Dou, W. (2004) ‘A K-connected energy-saving topology control algorithm for wireless sensor networks’, Proceedings of the 6th International Workshop on Distributed Computing (IWDC 2004), pp.520–525.
Zhang, W., Xue, G. and Misra, S. (2007) ‘Fault-tolerant relay node placement in wireless sensor networks: problems and algorithms’, Proceedings of 26th IEEE International Conference on Computer Communications (INFOCOM 2007), 6–12 May, Anchorage, AK, pp.1649–1657.