A Fault-Tolerant Network Management Architecture for Wireless Sensor Networks
Muhammad Zahid Khan, Madjid Merabti, Bob Askwith, Faycal Bouhafs
School of Computing and Mathematical Sciences, Liverpool John Moores University, Byrom St., Liverpool, L3 3AF, UK
[email protected], {M.Merabti, R.J.Askwith, F.Bouhafs}@ljmu.ac.uk
Abstract- Energy-efficient network self-organization and fault tolerance have been identified as key challenges in the design and operation of Wireless Sensor Networks (WSNs). To address these challenges, we propose a Zone-based Fault-Tolerant Management Architecture (ZFTMA) for WSNs. The proposed architecture is composed of two novel contributions: an energy-efficient network self-organization scheme and a fault management architecture that offers efficient fault detection and recovery mechanisms to make the network fault-tolerant. Analysis and discussion reveal that ZFTMA can be applied to the design of WSN protocols and applications that require energy efficiency, fault tolerance, a maximized network lifetime, and scalability.
I. INTRODUCTION
Wireless Sensor Networks (WSNs) make extensive use of tiny, resource-scarce wireless sensor devices (limited power, low processing and communication capability), which are deployed over a large geographical area to provide important environmental information for applications such as security surveillance, structure monitoring, precision agriculture and pervasive health monitoring [1]. WSNs provide several advantages over traditional networks, such as large-scale deployment, high-resolution sensed data, and application-adaptive mechanisms. However, due to their unique characteristics (dynamic topology, ad-hoc and unattended deployment, large volumes of generated data and traffic, limited bandwidth and energy) and tight integration with the physical environment, WSNs pose considerable challenges for network management and make application development nontrivial.
In network management of WSNs, energy-efficient network self-organization is one of the main challenges. Self-organization is the property that allows sensor nodes to organize themselves to form the network. It is challenging in WSNs because of the tight constraints on the bandwidth and energy resources available in these networks [2]. Moreover, because WSNs are deployed in hostile and unattended environments, faults and failures are normal occurrences; fault tolerance and reliable data dissemination are therefore also of great importance [3]. Thus, energy-efficient network self-organization and fault tolerance have been identified as key challenges in the design and operation of WSNs.
To address these challenges, we propose a Zone-based Fault-Tolerant Management Architecture (ZFTMA) for WSNs. The proposed architecture is composed of two novel contributions: an energy-efficient network self-organization scheme and a fault management architecture that offers efficient fault detection and recovery mechanisms to make the network fault-tolerant. In our new zone-based network organization scheme, sensor nodes organize themselves into multiple clusters to construct a fully connected sensor network with minimum resource utilization, hence saving energy and communication bandwidth. For fault tolerance, we propose a fault management architecture which carries out localized fault detection and recovery through a hierarchy of sensor nodes (Central Manager, Zone Manager, and Cluster-Head). Analysis and discussion reveal that ZFTMA can be applied to the design of WSN protocols and applications that require energy efficiency, fault tolerance, and a maximized network lifetime. In addition, ZFTMA distributes management tasks through a hierarchy of nodes to provide localized management, low-latency and reliable data dissemination, and scalability.
The rest of the paper is organized as follows: Section II provides the background and literature review. In Section III, we introduce our proposed architecture and explain its different operations. Section IV discusses the significant features of our architecture. Finally, Section V provides the conclusion and further work.
II. BACKGROUND
Network management is the process of managing, monitoring, and controlling the behavior and performance of a network [4]. The individual sensor nodes in WSNs are intrinsically resource-scarce. Collectively, these sensor devices have considerable processing capabilities, but not individually. In addition, due to the large-scale deployment of sensors in remote environments, managing numerous sensor nodes individually is not a good solution. Therefore, network self-organization is an essential requirement for large-scale WSNs. By self-organization we mean the process of autonomous formation of the network and its routing structure.
Furthermore, the mechanisms of self-organization could provide many solutions in WSNs. For example, self-organization could change the density of sensor nodes and traffic pattern, or help to reconfigure the network topology in the case where a node moves or fails [2]. Sensor nodes in WSNs are expected to operate autonomously for a long period of time and may not be easily approachable for battery replacement and maintenance due to
their physical deployment location; hence faults and failures are normal occurrences in WSNs [3]. Thus, in order to guarantee the network quality of service and performance, it is essential for WSNs to be able to detect faults and to heal and recover from events that might cause faults or misbehaviour in the network. Therefore, fault tolerance should be seriously considered in the design of WSN applications [5]. Fault tolerance is the ability to ensure the functionality of the network in the event of faults and failures. A set of functions or applications designed specifically for this purpose is called a fault-management platform, which is an integral part of a network management system.
In this section, we discuss relevant literature in the area of network management of WSNs with respect to fault tolerance. Architecture-based solutions for network management in WSNs are usually classified into three categories according to their management system network architecture [4, 6, 7]: centralized, distributed, and hierarchical. Our work is mainly based on a hierarchical clustering architecture; therefore, we give a detailed review of hierarchical clustering-based approaches.
Hierarchical management architecture is a hybrid between the centralized and distributed approaches. Sub-controllers or managers are distributed throughout the network in a tree-shaped hierarchy with lower and higher levels. Hierarchical architectures are further categorized into the following two main classes.
A. Hierarchical cluster-based schemes
Most contemporary management architectures use clustering-based hierarchical approaches. Hierarchical clustering introduces an extra level of management nodes that facilitate the distribution of control over the entire network. It saves energy and reduces network contention by enabling locality of communication [8].
In the clustering paradigm, sensor nodes in the network are grouped together to efficiently relay the sensed data to the Sink/Base Station. Each group of sensor nodes, or cluster, has a cluster head node. Examples are distributed fault detection using clustering mechanisms [9-11], WinMS [12], and localized fault-tolerant event boundary detection in sensor networks [13]. To improve the robustness and efficiency of the cluster-based scenario, Lai and Chen [14] proposed the CMATO (Cluster-Member-based fAult Tolerant mechanism) algorithm. CMATO views the cluster as a whole and takes advantage of the intercluster monitoring of nodes to detect faults. When the cluster members detect a fault that is caused by the cluster head, they act co-operatively to select a new cluster head to replace the failed one.
B. Self-Managed Schemes
Self-managed fault management means that a WSN must perform fault management tasks and services with minimal or no human intervention, with the goal of promoting network productivity and quality of service [15]. Self-managed fault-tolerant WSNs must be able to detect and recover from
various network and sensor faults locally, in a distributed way, with minimum resource utilisation [12]. Yu et al. [8] proposed a biologically inspired self-managed fault management architecture. The proposed architecture fully distributes the management tasks among different sensor nodes in the network. The scheme introduces more self-managing functions to sensor nodes, which encourages them to monitor their own status rather than frequently consulting their cluster head. In addition, the authors give a solution for faulty node replacement in a self-configurable WSN. However, it fails when a node is isolated or disconnected from the network.
In light of the literature review, it can be concluded that these approaches suffer from problems such as insufficient scalability, availability and flexibility when the network becomes more distributed. Most existing approaches mainly focus on failure detection; however, there is still no comprehensive solution for fault management in WSNs from the management architecture perspective. Fault recovery mechanisms are mainly application-specific (e.g. gateway recovery, common node recovery) and focus on small regions or individual nodes, and are therefore not fully scalable. The scheme of Hsin et al. [16] requires the network to be pre-configured, which is very costly for resource-constrained WSNs. Most existing approaches [17] isolate failed or misbehaving nodes directly from network communication, but provide no adequate fault recovery procedure. It is clear from the above discussion that energy-efficient network management is an important aspect of WSNs. In particular, a fault-tolerant network management architecture is required that can detect and recover from faults efficiently. To address these challenges, we propose a Zone-based Fault-Tolerant Management Architecture suitable for large-scale WSNs.
Our proposed ZFTMA architecture self-organizes the network with minimum resource utilization. Fault tolerance is achieved by incorporating a fault management platform which detects and recovers from various faults.
III. ZONE-BASED FAULT-TOLERANT MANAGEMENT ARCHITECTURE (ZFTMA) FOR WSNS
ZFTMA self-organizes all sensor nodes into multiple clusters to form a fully connected WSN and performs efficient fault management operations to make the network more fault tolerant. ZFTMA divides the whole network into four symmetric zones and assigns a resourceful node (known as a Zone Manager) to each zone. This distributes management tasks throughout the network, which reduces the message exchange between nodes and the Sink and hence conserves energy. In this section, we first briefly describe the network specifications and assumptions needed for the design of ZFTMA. Secondly, we present the ZFTMA system overview to explain the underlying design of the architecture. The ZFTMA network specifications and assumptions are as follows:
• We assume static and homogeneous sensor nodes. Homogeneous sensors have the same transmission power and radio range.
• The network is divided into four zones (Z1, Z2, Z3 and Z4) using a Cartesian coordinate system; furthermore, each zone is assigned a resourceful node, the Zone Manager (ZM), which is in 1-hop direct communication with the Central Manager (CM) or sink.
• We assume the sensor network lies in a two-dimensional plane. In a Cartesian coordinate system, each point has an X-coordinate representing its horizontal position and a Y-coordinate representing its vertical position, typically written as an ordered pair (X, Y). In ZFTMA, the Cartesian coordinate system is applied to divide the sensor network plane into 4 zones.
• The CM resides at the centre of the sensor field. This gives the advantage that the CM is at an equal distance from all ZMs.
• The ideal percentage of cluster heads is selected at random a priori: 5% of nodes are selected as cluster heads in ZFTMA. The calculation of the ideal and optimal number of cluster heads is available in [18], whose authors verified that the optimum number of clusters is around 3-5 for a 100-node network.
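The zone division described in the assumptions above can be illustrated with a minimal Python sketch. The paper does not state how zones are numbered or how nodes lying exactly on an axis are assigned, so the counter-clockwise quadrant numbering, the tie-breaking rule, and the function name `zone_of` are all assumptions made purely for illustration.

```python
from typing import Tuple

Point = Tuple[float, float]


def zone_of(node_xy: Point, cm_xy: Point = (0.0, 0.0)) -> str:
    """Map a node position to one of four zones around the Central Manager.

    Assumption (not from the paper): zones are numbered like the usual
    Cartesian quadrants, counter-clockwise from the upper-right one, and a
    node lying exactly on an axis is assigned by the >= comparisons below.
    """
    dx = node_xy[0] - cm_xy[0]  # horizontal offset from the CM
    dy = node_xy[1] - cm_xy[1]  # vertical offset from the CM
    if dx >= 0 and dy >= 0:
        return "Z1"
    if dx < 0 and dy >= 0:
        return "Z2"
    if dx < 0 and dy < 0:
        return "Z3"
    return "Z4"
```

For example, with the CM placed at the centre (50, 50) of a 100 x 100 field, a node at (10, 90) falls in the upper-left quadrant, which this sketch labels Z2.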
A. System Overview
ZFTMA provides a set of management functions that integrate network self-organization, data dissemination, and fault management of all the participating entities of a network, while taking into account the unique characteristics and restrictions of WSNs. Our aim is to design self-organized WSNs that form multiple clusters in a distributed manner to cover all communicating nodes, while facilitating energy-efficient operations. In ZFTMA, we propose a clustering mechanism exclusively from a management perspective. To facilitate this, we developed a zone-based, four-tier, hierarchical cluster-based sensor network: Common Sensor Nodes (SN), Cluster Head Nodes (CH), Zone Manager Nodes (ZM) and the Central Manager (CM), as illustrated in Fig. 1.
Figure 1. Zone-based Fault Management Architecture (ZFTMA)
Energy conservation is one of the core issues in the context of network management. One way to conserve energy is to distribute management tasks across the network and minimize the amount of control traffic. To accomplish this goal, we divide the network into four sub-zones and assign a resourceful node, which we refer to as the Zone Manager (ZM), to perform management tasks for the local zone in a distributed manner (see Fig. 1). Each zone is treated as a sub-network, and a ZM only obtains management information from the portion of the network to which it is assigned. During the data dissemination phase, the ZM performs data aggregation for its zone and only the results are sent to the CM, which conserves bandwidth and ultimately extends network lifetime.
B. Network Self-Organization in ZFTMA
ZFTMA self-organizes all sensor nodes into multiple clusters to form a fully connected WSN. In ZFTMA the cost of network self-organization is very small, since the CHs are initially selected randomly and autonomously through a random probability function, explained in the next sub-section. Our network self-organization process consists of a network set-up phase and a data acquisition phase. We explain the network set-up phase in this section, whilst we have inherited the data acquisition procedure from the well-known LEACH [18] protocol. The network set-up phase marks the beginning of the self-organization process. It consists of two main procedures: Cluster-Head (CH) selection and cluster formation.
1) Cluster-Head Selection: In ZFTMA, the network self-organization process is initiated by the CM, which broadcasts a Clustering Initialization message to its 1-hop ZMs once the WSN is deployed. When a ZM receives this message, it initiates the CH selection process, which is followed by the cluster formation procedure. Initially, an optimum number of sensor nodes elect themselves as cluster heads based on a simple random probability selection function. The problem of the optimum number of cluster head nodes in a network has been discussed in [18]. Their results show that 5% of the nodes in the network operating as cluster heads can achieve good performance in a homogeneous sensor network with various parameter settings. Let there be N sensor nodes randomly distributed in an area. A sub-set of K sensor nodes select themselves as CH nodes based on simple random probability. Each individual sensor node in the K out of N nodes is chosen randomly and entirely by chance, such that each node has the same probability of being chosen as a cluster head; moreover, no sensor node can occur more than once among the selected nodes. This is simply done without replacement of a selected node (Simple Random Sampling Without Replacement, SRSWOR), so that no node is chosen more than once. Thus in this case we have:
Where K is the optimum number of nodes being selected as cluster heads out of N nodes. This implies that all combinations of K CH nodes have the same probability of being selected. The probability of selected nodes can be modelled as:
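As an illustration of the selection step, the following Python sketch draws K cluster heads out of N nodes by simple random sampling without replacement. It relies only on standard SRSWOR facts: there are C(N, K) = N! / (K!(N-K)!) possible CH sets, each equally likely with probability 1 / C(N, K), and each node is included with probability K/N. These are textbook results and are not claimed to be the exact expressions used by the authors. The function names and the centralized sampling call are assumptions; in ZFTMA the nodes elect themselves.

```python
import math
import random
from typing import List, Optional, Sequence


def select_cluster_heads(node_ids: Sequence[int],
                         fraction: float = 0.05,
                         rng: Optional[random.Random] = None) -> List[int]:
    """Pick K = fraction * N cluster heads without replacement (SRSWOR).

    The 5% default follows the figure quoted from [18]; sampling centrally
    with random.sample is an illustrative simplification.
    """
    rng = rng or random.Random()
    k = max(1, round(len(node_ids) * fraction))  # at least one CH
    return rng.sample(list(node_ids), k)         # no node is picked twice


def num_possible_ch_sets(n: int, k: int) -> int:
    """Number of distinct K-subsets of N nodes, i.e. C(N, K)."""
    return math.comb(n, k)


def inclusion_probability(n: int, k: int) -> float:
    """Probability that a given node ends up in the sample under SRSWOR."""
    return k / n
```

With N = 100 and the 5% setting, `select_cluster_heads(range(100))` returns five distinct node IDs, and each node has a 0.05 chance of being among them.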
2) Cluster Formation: When the CH selection process ends, the cluster formation process starts simultaneously in every zone of the network. Each sensor node becomes part of a cluster by choosing the CH that is closest to itself, based on the received signal strength (RSSI). The mechanism of clustering and self-organization is described with the help of the scenario shown in Fig. 2.
Figure 2. Cluster formation process in ZFTMA
Randomly selected cluster head nodes in each zone broadcast a CH join advertisement message (ch_join_adv), which contains the CH's ID, to every node in radio range, Fig. 2(b). Initially, each sensor node is in a non-associated state. When a node receives ch_join_adv messages, it evaluates the Received Signal Strength Indication (RSSI) from each CH. It then sends a join request message (node_join_req) along with its node ID (node_id) to the CH from which it received the highest signal strength, Fig. 2(c). Based on the strongest RSSI, the node becomes a member of that cluster. In this way every sensor node recognizes its associated CH. Along with the node_join_req, each node also broadcasts a topology reply packet to construct the routing path. The topology reply packet includes the sensor node's location and energy level Elevel. This packet helps to determine the routing path from one node to another and could also be used to perform other relevant tasks. Each CH builds a cluster table which contains the node ID and residual energy level of its member sensor nodes. With this strategy, sensor nodes self-organize to form a connected network. The network is partitioned into a certain number of clusters and each node belongs to a single cluster.
• Case 1: If two CHs are in close proximity to each other and receive each other's ch_join_adv, one of the CHs surrenders its position as CH in favour of the other. The CH that surrenders is chosen at random between the two, by mutual agreement.
• Case 2: If a node does not receive a ch_join_adv for a predefined time period Twait, it automatically advertises itself as a CH node, broadcasts a ch_join_adv, and creates its own cluster.
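The join procedure and Case 2 above can be sketched in Python as follows. This is a simplified, single-process illustration rather than the authors' implementation: message passing is replaced by direct function calls, and the names `ClusterHead`, `strongest_ch` and `join_cluster` are hypothetical. RSSI values are assumed to be in dBm, so a larger (less negative) value means a stronger signal.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ClusterHead:
    ch_id: int
    # Cluster table: member node_id -> last reported residual energy (Elevel)
    cluster_table: Dict[int, float] = field(default_factory=dict)


def strongest_ch(rssi_by_ch: Dict[int, float]) -> Optional[int]:
    """Return the CH whose ch_join_adv was heard with the highest RSSI.

    Returns None when no advertisement was heard within Twait.
    """
    if not rssi_by_ch:
        return None
    return max(rssi_by_ch, key=rssi_by_ch.get)


def join_cluster(node_id: int, e_level: float,
                 rssi_by_ch: Dict[int, float],
                 heads: Dict[int, ClusterHead]) -> int:
    """Associate a node with a CH and return the ID of its cluster head."""
    ch_id = strongest_ch(rssi_by_ch)
    if ch_id is None:
        # Case 2: nothing heard, so the node advertises itself as a CH.
        heads[node_id] = ClusterHead(node_id)
        return node_id
    # Stand-in for node_join_req: the CH records the member and its energy.
    heads[ch_id].cluster_table[node_id] = e_level
    return ch_id
```

A node that hears CH 1 at -70 dBm and CH 2 at -55 dBm joins CH 2, while a node that hears nothing becomes the head of a new cluster. Case 1 (two neighbouring CHs negotiating which one steps down) is left out of this sketch.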
C. Fault-Tolerance in ZFTMA
Fault tolerance is the ability to ensure the functionality of the network in the event of faults and failures [3]. A set of functions or applications designed specifically for this purpose is called a fault-management platform. In our proposed architecture, we incorporate a fault management platform as an integral part of the network management infrastructure in order to make WSNs fault tolerant. In ZFTMA, we primarily concentrate on permanent and potential node faults. We assume that software applications are usually fault-free, and focus on permanent and potential hardware faults that occur due to a node's battery depletion or sudden crash. In this section we explain the fault management functionalities of our proposed architecture. We describe the CH rotation and load balancing mechanism of ZFTMA and explain how it avoids the premature death of CH nodes to make the network more fault tolerant. We then explain the fault detection and recovery procedures of our proposed fault management architecture.
1) Cluster-Head Rotation and Load Balancing: Due to intra-cluster communication, inter-cluster communication, and data processing, CH nodes expend more energy than non-cluster-head nodes. As a result, the CH nodes die before other nodes. However, all nodes should run out of battery at about the same time, so that very little residual energy is wasted when the network expires. One way to avoid the premature death of a CH node is to rotate the CH role among the nodes in a WSN based on their available energy levels. This helps to distribute the burden carried by a CH evenly among all nodes, so that all nodes have approximately the same lifetime. There are many cluster-based sensor network management schemes in the WSN literature, of which LEACH [18], HEED [19], and EDAC [20] are some of the best-known dynamic CH role rotation schemes.
EDAC uses a dynamic, residual-energy-based method to trigger the CH rotation process, whereas LEACH and HEED use predetermined, periodic, time-based CH role rotation. If the CH role is rotated after a predetermined number of data gathering rounds, a small number of rounds results in excessive overhead during the CH rotation phase. On the other hand, if the number of data transmission rounds before a CH rotation is large, the CH node may not have enough energy left to act as an ordinary sensor node after relinquishing the CH role [21]. Unlike the above-mentioned approaches, ZFTMA is based on a self-managed approach: the CH node monitors its own residual energy and triggers the CH rotation process if its residual energy drops below a certain threshold, EThrs_level. Hence, the autonomous energy-driven CH rotation scheme reduces the frequency of rotation. The CH rotation process in ZFTMA is as follows. During the cluster formation and data acquisition phases, the CH creates a cluster table of all its member sensor nodes with their node_ids, and updates this table at each data gathering cycle. Each CH monitors its own energy level; when it reaches the predefined threshold value EThrs_level, the CH triggers the CH rotation process. The CH then analyzes the cluster table and selects the node
with the highest energy level as the next CH. To inform all other nodes, the CH broadcasts a message with the ID of the new CH (next_ch_id_adv). The current CH then relinquishes its position and joins the new CH as a member sensor node. In the same way, all other member nodes join the new CH with a node_join_req. This CH rotation process continues until the end of the data gathering cycles.
2) Sensor node fault detection: Fault detection is the first phase of fault management, in which an unexpected failure in the network should be properly identified by the network system. The ZFTMA architecture is designed such that CH nodes are the entities responsible for detecting permanent and potential faults that occur in their local clusters. We adopt a passive model for in-cluster fault detection, carried out locally by the CH, because the passive model is more energy-efficient and lightweight, with minimal communication overhead [17]. As described earlier, during the cluster formation phase the CH node creates a list of all its associated member sensor nodes and their respective energy levels Elevel. The CH keeps updating this list during the data acquisition phase. By analyzing the data packets received, the CH can identify those nodes which are not sending data. Due to the nature of the wireless medium, data sent by a node may be lost to collision or interference and never reach the CH. In such a case, the CH flags the node and waits for a specific time interval Twait. If the CH does not receive data from the node within this interval, the CH declares the node faulty and disseminates this information to the rest of the network, so as to initiate the recovery procedure.
3) Cluster-Head node fault detection: In hostile environments, unexpected failure of a CH may partition the network or degrade application performance; therefore, CH node fault detection is very important.
In our hierarchical model, the ZM performs CH node fault detection for its zone. During the data acquisition phase, the ZM constructs the overall topology of the zone and maintains a list of all the CHs in its zone. The ZM then analyzes the data received from all of the CHs. If the ZM does not receive a data packet from a CH within a time interval TDwait, it flags this CH as a faulty node and disseminates this information to the rest of the network, and the CH fault recovery process is initiated.
4) Cluster-head fault recovery: The CH fault recovery process starts immediately after a CH fault is detected. During the data transfer from the CH to the ZM, each CH also sends the list of its cluster members and their respective energy levels. This allows the ZM to have a complete view of the zone topology. When a faulty CH node is identified, all the cluster members associated with it are gradually informed about the CH failure. For the CH recovery operation, the ZM chooses a new CH from the cluster members list. This choice is based on the residual energy of each cluster member, so the newly selected CH is the node with the highest energy reserves in the cluster. The ZM then announces the ID of the new CH to the rest of the cluster members. The common sensor nodes then send a node_join_req to the new CH and piggyback their energy level Elevel on it.
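The rotation trigger, the passive timeout-based fault detection, and the energy-based choice of a replacement CH described in this section share a few simple decision rules, sketched below in Python. The function names, the use of plain floats for energy and timestamps, and the `<=` comparison against the threshold are assumptions made for illustration; the same `detect_silent` helper stands in for both the CH watching its members (Twait) and the ZM watching its CHs (TDwait).

```python
from typing import Dict, List, Optional


def needs_rotation(ch_energy: float, e_thrs_level: float) -> bool:
    """True once the CH's residual energy has reached EThrs_level."""
    return ch_energy <= e_thrs_level


def pick_next_ch(cluster_table: Dict[int, float]) -> Optional[int]:
    """Choose the member with the highest residual energy as the next CH.

    Used for rotation (by the current CH) and for recovery (by the ZM).
    Ties resolve to the first such entry in the table; None if it is empty.
    """
    if not cluster_table:
        return None
    return max(cluster_table, key=cluster_table.get)


def detect_silent(last_heard: Dict[int, float], now: float,
                  wait: float) -> List[int]:
    """Passive detection: IDs not heard from for longer than the wait time."""
    return sorted(nid for nid, t in last_heard.items() if now - t > wait)
```

For instance, with a cluster table of {3: 0.4, 7: 0.9, 9: 0.6}, node 7 would be announced as the new CH, and a node last heard 8 time units ago would be declared faulty under a wait interval of 5.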
IV. DISCUSSION
We presented ZFTMA, a self-organizing, energy-efficient fault management architecture for WSNs. ZFTMA can be applied to the design of WSN protocols and applications that require energy efficiency, fault tolerance, a maximized network lifetime, and scalability. In this section, we discuss and analyze the effectiveness of ZFTMA in terms of energy efficiency and fault tolerance for WSNs.
Conserving energy and maximizing network lifetime are important features of the proposed architecture. In ZFTMA the energy cost of network self-organization is very small, since the CHs are initially selected randomly and autonomously through a random probability function. Unlike LCA [22] and HEED [19], in ZFTMA the CH selection decision is made by each node independently of other nodes, hence avoiding excessive message exchange with neighbouring nodes. Based on the Received Signal Strength Indication (RSSI), sensor nodes automatically organize themselves into clusters in a distributed way.
In ZFTMA, management tasks are distributed throughout the network through a hierarchy of different nodes (CM, ZM and CH). Management decisions are taken locally, so less information is delivered to the CM, which reduces network traffic and node energy consumption. Distributing management tasks across the network and limiting the amount of control traffic thus minimizes energy consumption. In ZFTMA, each CH node aggregates local data into a single packet and transmits it to its ZM. Each CH can also process local management information (such as allocation of TDMA schedules to nodes, detection of failed nodes, and cluster table creation) and can decide which management information is critical and should be forwarded, thus eliminating the forwarding of unnecessary data to the base station. Moreover, sensor networks are data-centric; therefore, a CM can retrieve data from a specific zone of the network.
Moreover, dividing the network into zones gives the advantage that the CM can use a ZM to probe a certain zone or region of the network for data acquisition.
To conserve energy and maximize network lifetime, the ZFTMA architecture employs a load balancing technique that allows all nodes to function properly for a long period of time. This load balancing is achieved through CH rotation among sensor nodes in order to avoid the excessive use of certain nodes. This re-election method avoids selecting an energy-starved node as CH, which might lead to the cluster becoming dysfunctional once the CH dies.
For fault tolerance, fault management functionalities in our proposed architecture are performed whenever a potential or permanent fault is detected in the network. Fault management is performed in a distributed way with local decision making. A CH detects faults in its member sensor nodes, whereas a ZM detects and recovers from CH faults. This local decision making avoids excessive data communication with the base station. In ZFTMA, CH rotation and re-clustering also provide fault tolerance against potential CH faults. A threshold EThrs_level is set up in a CH; when its energy level reaches the threshold, a rotation
process is triggered. A new CH is selected among the member sensor nodes based on the highest residual energy. The re-clustering frequency has to be carefully selected to withstand expected failure rates, because where the failure rate is low, frequent re-clustering may result in significant resource waste [19].
V. CONCLUSION AND FUTURE WORK
Energy-efficient network self-organization and fault tolerance have been identified as key challenges in the design and operation of WSNs. To address these challenges, we proposed a Zone-based Fault-Tolerant Management Architecture (ZFTMA) for WSNs. The proposed architecture is composed of two novel contributions: an energy-efficient network self-organization scheme and a fault management architecture that offers efficient fault detection and recovery mechanisms to make the network fault-tolerant. Our new zone-based network organization scheme self-organizes all sensor nodes into multiple clusters to construct a fully connected WSN with minimum resource utilization, hence saving energy and communication bandwidth. For fault tolerance, we proposed a fault management architecture which carries out localized fault detection and recovery, with CH rotation and load balancing, through a hierarchy of nodes (Central Manager, Zone Manager, and Cluster-Head).
Our work so far leaves some related challenges that need to be addressed and that require further development and refinement of the proposed design. In our proposed scheme, CH selection is based solely on simple random selection; further research into other parameters (e.g. the position of the CH node relative to the base station, or the size of the cluster) is also required. In addition, during CH rotation, frequent re-clustering may limit the ability of a sensor node to perform its basic operations (sensing and data communication), which we will look into in the future.
Furthermore, in future we plan to extend our proposed design by incorporating mobility and autonomic fault management aspects in the context of a network management system.
REFERENCES
[1] J. Yick, et al., "Wireless sensor network survey," Computer Networks, vol. 52, pp. 2292-2330, 2008.
[2] R. Krishnan and D. Starobinski, "Efficient clustering algorithms for self-organizing wireless sensor networks," Ad Hoc Networks, vol. 4, pp. 36-59, 2006.
[3] M. Z. Khan, et al., "Design Considerations for Fault Management in Wireless Sensor Networks," presented at the 10th Annual Conference on the Convergence of Telecommunications, Networking and Broadcasting, Liverpool, UK, 2009.
[4] W. L. Lee, et al., Network Management in Wireless Sensor Networks: Handbook on Mobile Ad Hoc and Pervasive Communications. American Scientific Publishers, 2006.
[5] H. Liu, et al., "Fault-Tolerant Algorithms/Protocols in Wireless Sensor Networks," in Guide to Wireless Ad Hoc Networks, ed: Springer-Verlag London, 2009, pp. 265-295.
[6] I. F. Akyildiz, et al., "A Survey on Sensor Networks," IEEE Communications Magazine, pp. 102-114, August 2002.
[7] A. Akbari, et al., "A Survey Cluster-Based and Cellular Approach to Fault Detection and Recovery in Wireless Sensor Networks," World Applied Sciences Journal, vol. 8, pp. 76-85, 2010.
[8] M. Yu, et al., "Self-Managed Fault Management in Wireless Sensor Networks," presented at the Second International Conference on Mobile Ubiquitous Computing, Systems, Services and Technologies, UBICOMM '08, 2008.
[9] A. T. Tai, et al., "Cluster-based failure detection service for large-scale ad hoc wireless network applications," presented at the International Conference on Dependable Systems and Networks, 2004.
[10] G. Venkataraman, et al., "A Cluster-Based Approach to Fault Detection and Recovery in Wireless Sensor Networks," presented at the 4th International Symposium on Wireless Communication Systems, ISWCS'07, 2007.
[11] C. Yao-Chung, et al., "Cluster based self-organization management protocols for wireless sensor networks," IEEE Transactions on Consumer Electronics, vol. 52, pp. 75-80, 2006.
[12] W. L. Lee, et al., "WinMS: Wireless Sensor Network-Management System, An Adaptive Policy-Based Management for Wireless Sensor Networks," School of Computer Science & Software Engineering, The University of Western Australia, CSSE Technical Report UWA-CSSE-06-001, June 2006.
[13] M. Ding, et al., "Localized fault-tolerant event boundary detection in sensor networks," presented at INFOCOM 2005, 24th Annual Joint Conference of the IEEE Computer and Communications Societies, 2005.
[14] L. Yongxuan and C. Hong, "Energy-Efficient Fault-Tolerant Mechanism for Clustered Wireless Sensor Networks," presented at the Proceedings of 16th International Conference on Computer Communications and Networks, ICCCN'07, 2007.
[15] M. Yu, et al., "Autonomic Networking in Wireless Sensor Networks," ed, 2009, pp. 261-284.
[16] C. Hsin and M. Liu, "A Two-Phase Self-Monitoring Mechanism for Wireless Sensor Networks," Journal of Computer Communications special issue on Sensor Networks, vol. 29, pp. 462-476, February 2006.
[17] Y. Mengjie, et al., "Fault Management in Wireless Sensor Networks," IEEE Wireless Communications, vol. 14, pp. 13-19, 2007.
[18] W. B. Heinzelman, et al., "An application-specific protocol architecture for wireless microsensor networks," IEEE Transactions on Wireless Communications, vol. 1, pp. 660-670, 2002.
[19] O. Younis and S. Fahmy, "HEED: a hybrid, energy-efficient, distributed clustering approach for ad hoc sensor networks," IEEE Transactions on Mobile Computing, vol. 3, pp. 366-379, 2004.
[20] W. Yongcai, et al., "Energy-driven adaptive clustering data collection protocol in wireless sensor networks," presented at the International Conference on Intelligent Mechatronics and Automation, 2004.
[21] S. Gamwarige and C. Kulasekere, "Optimization Of Cluster Head Rotation in Energy Constrained Wireless Sensor Networks," presented at the IFIP International Conference on Wireless and Optical Communications Networks (WOCN'07), 2007.
[22] K. Akkaya and M. Younis, "A survey on routing protocols for wireless sensor networks," Ad Hoc Networks, vol. 3, pp. 325-349, 2005.