Towards A Holistic Approach for Protocol Development in Sensor Networks
by
Sameer Tilak
M.S., SUNY Binghamton, 2002
B.E., Pune Institute of Computer Technology, 1999

THESIS
Submitted in partial fulfillment of the requirements for
the degree of Doctor of Philosophy in Computer Science
in the Graduate School of
Binghamton University
State University of New York
2005
Abstract

Technological advances in VLSI, MEMS, and wireless communication have ushered in a new age of miniature, low-cost, low-energy micro-sensors. Networks of such devices, called Wireless Sensor Networks (WSNs), hold the promise of revolutionizing sensing across a range of civil, scientific, military, and industrial applications. The potentially high impact of this technology and the complex challenges posed by it have spurred intense interest in the research, military, and commercial communities. However, sensors often have limited energy, computational ability, and storage capacity. Protocols that manage the different aspects of the sensors' operation, collecting and processing data as well as providing support services such as localization and synchronization, must therefore be developed to work efficiently within the constraints of the limited available resources; this is an extremely challenging task.

The first contributions of my research are in developing application-specific, lightweight, energy-efficient protocols for several critical sensor subsystems and services, including information dissemination, storage management, and localization. Based on the experience gained with these diverse applications, subsystems, and services, we represent the basic sensor network design goal as a balance between the application-level utility of operations and their cost in terms of resources. Quantifying these values (utility and cost) allows each sensor node to determine how to carry out decisions (data collection and dissemination, as well as service-related operations) most effectively. However, a local estimate of utility and resource cost may not be accurate when considered globally. As a result, we have also introduced the concept of context, which represents globally available information that has significant influence on the local estimate of either utility or resource cost.
We formalize these ideas and investigate the challenges in applying them in a real sensor network system. We call this approach a Holistic Approach for Protocol Design in Sensor Networks. Implementing the holistic approach requires a novel abstraction that explicitly exposes the cost and utility of operations, so that the estimated benefit of a decision can be compared against its cost. To that end, we propose a novel file-system-based abstraction for sensor networks that has several desirable properties for both system developers and application developers. A key feature of this abstraction is namespaces, which let applications organize sensor networks in an application-specific manner; we also advocate a standard resource namespace that exposes resource information, such as the available energy or storage space on a given sensor node, to the application. We demonstrate how a range of canonical sensor network applications can be built around the concept of namespaces to provide richer yet resource-efficient interfaces to sensor network users. Finally, we developed a scalable, energy-efficient resource discovery framework so that heterogeneous sensors can carry out resource discovery in a platform-independent manner.
Contents

1 Introduction
  1.1 Sensor Networks: Applications and Challenges
    1.1.1 Opportunity and Applications
    1.1.2 Challenges
  1.2 Motivation and Overview of Contributions
    1.2.1 Sensor Network Subsystems and Services
    1.2.2 Towards a Holistic Approach for Sensor Network Systems
    1.2.3 File System Abstraction and Resource Discovery

2 Background
  2.1 Components of sensor network infrastructure
  2.2 Sensor platforms
  2.3 Sensor Network Taxonomy
  2.4 Micro-sensor Network Components
  2.5 Sensor Network Architecture
  2.6 Communication Models
  2.7 Data Delivery Models
  2.8 Network Dynamics Models
  2.9 Canonical Applications of WSN
  2.10 Performance Metrics/Design Goals

3 Information Dissemination in WSN
  3.1 Introduction
  3.2 Application-specific Design Goals
  3.3 Dissemination Protocols
    3.3.1 Flooding
    3.3.2 Deterministic Protocols
    3.3.3 Randomized Protocols
  3.4 Experimental Study
    3.4.1 Traffic Load Study
    3.4.2 Mobility Study
  3.5 Discussion
  3.6 Related Work
  3.7 Conclusions and Future Work

4 Collaborative Storage Management
  4.1 Introduction
  4.2 Motivation
  4.3 Energy-Storage Tradeoff Space
  4.4 Related Work
  4.5 Storage Management Protocols
    4.5.1 Storage Approaches
    4.5.2 Collaborative Storage Protocols for Data Aggregation
    4.5.3 Coordination for Redundancy Control
  4.6 Experimental Evaluation
    4.6.1 Storage and Energy Tradeoffs
    4.6.2 Storage Balancing Effect
    4.6.3 Effect of the Aggregation Model
  4.7 Conclusion and Future Work

5 Dynamic Localization Control for Mobile Sensor Networks
  5.1 Introduction
    5.1.1 Motivation – Mobile Sensor Applications
  5.2 Related Work
  5.3 Problem Definition: Location Tracking
  5.4 Dynamic Localization Protocols
    5.4.1 Static Tracking
    5.4.2 Adaptive Tracking
    5.4.3 Predictive Tracking
  5.5 Error Analysis: Constant Velocity Mobility
    5.5.1 Constant Velocity Mobility Scenario
  5.6 Experimental Results
    5.6.1 Energy-Accuracy Tradeoff
  5.7 Effect of change in mobility pattern
  5.8 Localization Mechanism associated trade offs
  5.9 Concluding Remarks

6 Towards a Holistic Approach for system design in Sensor Networks
  6.1 Introduction
  6.2 Utility-Based Sensor Network Design
  6.3 Utility, Cost and Context
    6.3.1 Benefit Estimation
    6.3.2 Data significance
    6.3.3 Data quality
    6.3.4 Cost Estimation
    6.3.5 Context
  6.4 Additional Challenges
  6.5 Preliminary Architecture
  6.6 Design trade offs
  6.7 Literature Survey
    6.7.1 Survey of Middleware for Networked Embedded Systems
    6.7.2 Microeconomics-inspired approaches in distributed systems
  6.8 Implementation Challenges and Future Work

7 A Filesystem Abstraction for WSN
  7.1 Introduction
  7.2 A Filesystem Abstraction of Sensor Networks
  7.3 Architecture
  7.4 Research Challenges
  7.5 Additional Capabilities
  7.6 Examples
    7.6.1 Sensor Monitoring and Calibration
    7.6.2 Data-Centric Application
    7.6.3 Heterogeneous Response System Architecture
  7.7 Implementation
  7.8 Conclusion

8 Dynamic Resource Discovery for Wireless Sensor Networks
  8.1 Introduction
  8.2 Motivation and Background
  8.3 Sensor Platforms
  8.4 Dynamic Resource Discovery in Sensor Networks
    8.4.1 Determining Tracked Attribute Set
    8.4.2 Resource Discovery Protocol
  8.5 Architecture
    8.5.1 Discussion
  8.6 Experimental Study
    8.6.1 Simulation Testbed
    8.6.2 Simulation Results
  8.7 Related Work
  8.8 Concluding Remarks

9 Conclusion and Future Work
  9.1 Contributions
  9.2 Future Work
List of Figures

2.1 WSN infrastructure components
2.2 Examples of Sensors and Actuators
2.3 MICA2 and MICA2DOT series Motes
2.4 MICAZ Series Mote
3.1 Grid Topology: Mean absolute error as a function of distance for different source data rates
3.2 Grid: Weighted energy-accuracy tradeoff
3.3 Random Topology: Mean absolute error as a function of distance for different source data rates
3.4 Tx = 150 m (Grid Topology): Mean absolute error as a function of distance for different source data rates
3.5 Mobile sensors (speed 2 m/sec): Mean absolute error as a function of distance for different source data rates
3.6 Mobile sensors (speed 10 m/sec): Mean absolute error as a function of distance for different source data rates
3.7 Mobile sensors: Weighted error-energy tradeoff
4.1 Storage space vs. Network Density
4.2 Energy Consumption and Collection Time Study
4.3 Percentage of Storage Depleted Sensors vs. Time
4.4 Storage Space vs. Aggregation Ratio
4.5 Biased Deployment vs. Coverage
5.1 Mobile Sensor with Localization Points
5.2 State Diagram for MADRD
5.3 Error with no deviation
5.4 Error for deviation of θ degrees
5.5 Errors in SFR and MADRD
5.6 Absolute Error as a function of Mobility and Pause Time
5.7 Error: Speed (4-5 m/s)
5.8 Energy Spent, Speed (0.5-1 m/s)
5.9 Percentage accuracy study as a function of mobility and pause time
5.10 Energy Spent, Speed (4-5 m/s)
5.11 Effect of pause time on MADRD
6.1 Utility
6.2 Holistic Framework Architecture
7.1 An example wireless sensor network in a zoo. Sensors track animal locations and resources such as food and water. The network is divided into two clusters, each with a cluster head
7.2 Namespace for a sensor network
7.3 An S&R system
8.1 Smart Mall
8.2 Cluster based sensor network architecture
List of Tables

2.1 Energy Characteristics (Flash Memory)
2.2 Energy Characteristics (IBM 340 MB Microdrive)
2.3 Radio Energy Characteristics
2.4 Sensor Platforms
3.1 Non-uniform information dissemination study: simulation parameters
5.1 Localization study: simulation parameters
8.1 Dynamic resource discovery: simulation parameters
8.2 Energy consumption study of various resource discovery protocols
Chapter 1

Introduction

Technological advances in VLSI, MEMS, and wireless communication have ushered in a new age of miniature, low-cost, low-energy micro-sensors. Networks of such devices, called Wireless Sensor Networks (WSNs), hold the promise of revolutionizing sensing across a range of civil, scientific, military, and industrial applications. This emerging technology provides an opportunity to collect information at unprecedented resolution due to the sensors' in-situ sensing capability, low cost, small size, and ease of deployment.

If the anticipated vision is fully realized, then in the future we will have devices as small as a grain of sand, equipped with sensors, computational ability, a wireless radio, and a power supply. These devices will be so inexpensive that we will have tens of thousands of such smart wireless sensors embedded within our physical environment. Because of their self-configuring nature, low-power operation, and small form factor, these sensors will perform in-situ and non-intrusive sensing of the real world at temporal and spatial scales that were previously unimaginable. These sensing devices will provide the interfaces between the physical and digital worlds [141]. Scientists believe that access to such detailed, fine-grained information will play a key role in answering several longstanding fundamental questions.
1.1 Sensor Networks: Applications and Challenges
A wireless sensor network is an emerging computing paradigm that combines distributed sensing, computing, and wireless communication. The Internet has transformed the way people communicate with each other, institutions function, and information is exchanged. In the same way, embedded wireless sensor networks are widely anticipated to be a disruptive technology, promising a paradigm shift in our understanding and control of the real world. Several natural questions arise. What are the killer applications of this technology? How easy is it to deploy and use? How costly is it? Is it ready to use today? In the first two chapters, we address some of these questions by presenting various sensor network applications, describing state-of-the-art infrastructure components (both hardware platforms and software elements), and discussing various research challenges posed by this emerging technology.
1.1.1 Opportunity and Applications

Already, with this technology only a few years old, exciting applications are emerging. Scientists have started deploying WSNs as non-intrusive tools for gathering data at high spatial and temporal resolutions from myriad environments, ranging from dense forests and rivers to manufacturing plants and smart homes [189]. Several interdisciplinary environmental monitoring applications have emerged; prominent examples include habitat monitoring of various species of birds [56], rare plants [161], and animals [103]. Recently, structural health monitoring applications [92] have received considerable attention. In these applications, sensors are deployed to monitor key civil infrastructure, including bridges, tunnels, national highways, power stations, and water plants; the sensors can detect anomalous conditions and provide early warning. State-of-the-art emergency response applications [65] include sensor networks as a crucial component of their architecture. Sensor networks can also revolutionize the health care industry. For example, homes can be instrumented with sensors and actuators that monitor the vital signs of patients and alert doctors if a patient requires immediate medical assistance. In Section 2.9 of the next chapter, we give a detailed overview of canonical WSN applications.
1.1.2 Challenges

The sample applications mentioned above represent the first generation of deployed WSNs. Eventually, large-scale sensor networks are envisioned, with thousands of tiny sensing devices embedded deeply within a complex physical environment. Many challenges remain before that vision can be achieved. Typically, sensors have limited computing power, memory, and storage space; for example, Berkeley motes [131] have a 4 MHz Atmel chip, 256 KB memory, and 4 Mb of persistent storage. Sensor nodes are also battery-operated, so energy efficiency is a primary concern. The scale and limited resources of the sensors, together with their data-centric nature, present a unique set of challenges for sensor network developers. There is a premium on lightweight, scalable, energy-efficient protocols. Features of such protocols are emerging: they must rely on localized interactions, which includes collaboratively processing data in the network to reduce the amount of data that has to be relayed remotely. Another practical challenge is effectively programming such a widely distributed, heterogeneous, and constrained system. Further challenges are introduced by the nature of WSN deployment near the phenomena being observed, often in inhospitable environments. For example, one can imagine sensors being dropped from a plane over hostile terrain; in such cases, the sensors must self-configure within a reasonable time frame. Moreover, in many cases sensors communicate using wireless radios, and the lack of infrastructure, low available bandwidth, unreliable links, high loss rates, and harsh physical environments pose significant challenges for protocol development. Designing protocols for such resource-constrained, large-scale distributed systems is therefore a daunting task. We overview sensor networks and design challenges in more detail in Chapter 2.
1.2 Motivation and Overview of Contributions
The potentially high impact of this technology and the complex challenges posed by it have spurred intense interest in the research, military, and commercial communities. Several projects (in both
academia and industry [56, 103, 188]), ranging from networks of a few sensors (on the order of tens) to more ambitious ones (on the order of a few hundred sensors), are already deployed. These projects represent the first generation of such networks, emphasizing functionality (that is, can it be done at all?). However, as sensor networks evolve, designers must be able to operate them effectively, meeting the application requirements while respecting the sensors' limited energy, computational ability, and storage capacity. In the remainder of this section, we overview the contributions of this dissertation in detail.
1.2.1 Sensor Network Subsystems and Services

In this area, the focus was on developing application-specific, lightweight, and energy-efficient protocols for various core sensor subsystems and services, which we overview in this section.

Information Dissemination. In this work, we focused on a set of real-time applications where information is not collected centrally. Instead, event information is propagated within the network so that a set of mobile users can be notified of important events. Such applications include rescue and battlefield scenarios. We developed protocols that take advantage of the following property: the necessary precision and freshness of information depends on the distance between an information producer and an information consumer. We refer to such a requirement as a non-uniform information dissemination requirement, a new concept that we introduce. We proposed and analyzed several protocols that perform non-uniform information dissemination. This study is presented in Chapter 3.

Storage Management. In this work, we consider a class of sensor networks where the data are not required in real time by an observer; for example, a sensor network monitoring a scientific phenomenon for later playback and analysis. In such networks, the data must be stored in the network. Thus, in addition to battery power, storage is a primary resource: the useful lifetime of the network is constrained by its ability to store the generated data samples. We explore the use of collaborative storage techniques to efficiently manage data in storage-constrained sensor networks. The protocols and their experimental evaluation are presented in Chapter 4.

Dynamic Localization Control for Mobile Sensor Networks. Localization is a fundamental operation in mobile and self-configuring networks, such as some classes of sensor networks and mobile ad hoc networks. For example, sensor location is often critical for data interpretation.
Moreover, network protocols such as geographic routing and geographic storage require individual sensors to know their coordinates. Existing research focuses on localization mechanisms: algorithms and infrastructure designed to allow the sensors to determine their location. In particular, localization has been studied for static sensor networks, where sensors remain stationary throughout the lifetime of the network. For such networks, localization is a one-time (or low-frequency) activity: a sensor finds its location once and uses it in all of its future readings. In contrast, we consider localization for mobile sensors: when sensors are mobile, localization must be invoked periodically to enable the sensors to track their location. We proposed and investigated adaptive and predictive protocols that control the frequency of localization based on sensor mobility behavior, reducing the energy requirements of localization while bounding the localization error. This study is presented in Chapter 5.
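To make this concrete, the following is a simplified, self-contained sketch of adaptive localization scheduling. The class name, thresholds, and the interval doubling/halving policy are our own illustrative assumptions, not the exact protocols evaluated in Chapter 5: the node dead-reckons its position between fixes and localizes less often when its predictions prove accurate.

```python
import math

# Hypothetical sketch: a mobile node predicts its position between fixes
# using its last observed velocity, then compares the prediction with the
# next measured fix. Predictable motion -> longer localization interval
# (energy savings); unpredictable motion -> shorter interval (bounded error).
class AdaptiveLocalizer:
    def __init__(self, min_interval=1.0, max_interval=60.0, error_bound=5.0):
        self.interval = min_interval      # seconds between localization fixes
        self.min_interval = min_interval
        self.max_interval = max_interval
        self.error_bound = error_bound    # tolerated prediction error (meters)
        self.last_fix = None              # last measured position (x, y)
        self.velocity = (0.0, 0.0)        # estimated from the last two fixes

    def predict(self, dt):
        """Dead-reckoned position dt seconds after the last fix."""
        x, y = self.last_fix
        return (x + self.velocity[0] * dt, y + self.velocity[1] * dt)

    def on_fix(self, pos, dt):
        """Process a measured fix taken dt seconds after the previous one;
        returns the next localization interval."""
        if self.last_fix is not None and dt > 0:
            px, py = self.predict(dt)
            error = math.hypot(pos[0] - px, pos[1] - py)
            if error > self.error_bound:  # erratic motion: localize more often
                self.interval = max(self.min_interval, self.interval / 2)
            else:                         # predictable motion: save energy
                self.interval = min(self.max_interval, self.interval * 2)
            self.velocity = ((pos[0] - self.last_fix[0]) / dt,
                             (pos[1] - self.last_fix[1]) / dt)
        self.last_fix = pos
        return self.interval
```

Under constant-velocity motion the interval grows geometrically toward its cap, while a sudden change in direction or speed pulls it back down, which is the energy-accuracy tradeoff studied experimentally in Chapter 5.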
1.2.2 Towards a Holistic Approach for Sensor Network Systems

Based on our experiences with designing protocols for sensor networks – those described above as well as others in our previous work [191] – we have made the following recurring observation. Due to the premium placed on the limited available resources, effective sensor network operation requires a careful balance between the application requirements and the resource cost of operations. At a lower level, whenever a sensor node must carry out an operation that consumes resources, it should weigh the benefit of the operation against its cost. The balance of these two factors should determine when the node invests in data collection and management operations, as well as in services such as localization and synchronization.

Ideally, the evaluation of utility and resource cost would be carried out in a global sense. However, energy-efficiency and scalability concerns invite distributed solutions with localized interactions [64]. Thus, the sensors must use local estimates of utility and cost. In some cases, these local estimates may not be accurate when viewed globally: a piece of data signalling an important event may locally score highly in terms of utility, but if the same data is being produced by many sensors, the local estimate is inaccurate. To account for the divergence between global and local estimates of utility and resource cost, we introduce the notion of context. Context refers to globally available information that significantly moderates the local estimate of utility or cost. For effective operation, the gap between the local and global estimates should in some instances be reduced; that is, context must be tracked. We discuss the framework and outline the challenges that must be addressed before it can be put into practice.
Our initial work in defining this framework and the associated challenges that remain are presented in Chapter 6.
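As a deliberately simplified illustration of this decision rule, consider the following sketch. The function name, the multiplicative form of the context adjustment, and all numbers are assumptions made for this example, not the formulation developed in Chapter 6:

```python
# Hypothetical sketch of the utility-vs-cost decision described above.
def should_transmit(local_utility, resource_cost, context_factor=1.0):
    """Decide whether an operation (e.g., reporting a reading) is worthwhile.

    local_utility  -- the node's local estimate of the operation's benefit
    resource_cost  -- estimated cost in scarce resources (energy, storage)
    context_factor -- global moderation of the local estimate; e.g. < 1 when
                      many neighbors are reporting the same event, so the
                      marginal value of one more report is low
    """
    adjusted_utility = local_utility * context_factor
    return adjusted_utility > resource_cost

# A locally important reading is worth sending when it is unique...
assert should_transmit(local_utility=10.0, resource_cost=3.0)
# ...but not when context reveals heavy redundancy among neighbors.
assert not should_transmit(local_utility=10.0, resource_cost=3.0,
                           context_factor=0.1)
```

The point of the sketch is only the structure of the decision: context enters as a correction to a purely local estimate, which is why tracking it (at some cost) can pay for itself.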
1.2.3 File System Abstraction and Resource Discovery

Finally, we propose a file system abstraction for sensor networks and present a dynamic resource discovery protocol. From our experience with the design of different sensor network subsystems, we have observed that existing abstractions for sensor networks, such as database abstractions, often interfere with effective design. For example, database abstractions often hide the resource and structural aspects of the sensor network from the system developer, which can result in inefficient query execution. Moreover, the heterogeneity and resource constraints of sensor networks significantly complicate system and application development. Successfully addressing these multi-dimensional challenges relies crucially on developing an effective abstraction for sensor networks: a simple, well-understood abstraction can significantly ease both system development and application development. This is especially important because many sensor networks are deployed by scientists and researchers whose domain of expertise is not computer science.

To that end, we propose a flexible, intuitive file system abstraction for organizing and managing sensor network systems, based on the Plan 9 design principles, which espoused that the file system metaphor (as seen, for example, in the /proc file system) can be adopted for almost all aspects of system design and development. A key feature is the support of multiple views of the system via filesystem namespaces. Constructed logical views present an application-specific representation of the network, thus enabling high-level programming. Concurrently, structural views of the network
enable resource-efficient planning and execution of tasks. In essence, the abstraction allows us to expose resource cost and application-level utility in an efficient and unified manner. In Chapter 7, we present and motivate the proposed abstraction using several examples, outline research challenges and our plan to address them, and describe the current state of the implementation. We believe that this abstraction provides sufficient flexibility to serve as part of the holistic framework proposed in the previous subsection.

Another enabler for heterogeneous sensor network application development is the ability of the sensor network to discover resources in dynamic environments. To that end, we developed and evaluated protocols for carrying out this functionality. This functionality may also be viewed as a component of the proposed holistic framework, enabling it to discover and normalize resources in a heterogeneous sensor network. Further details regarding this work can be found in Chapter 8.

Finally, in Chapter 9, we summarize the contributions of this dissertation and discuss our future research plans.
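Before moving on, a purely illustrative rendering of the namespace idea from Section 1.2.3 may help fix intuitions. The paths and attribute names below are invented for this sketch; Chapter 7 describes the actual abstraction and its implementation:

```python
# Hypothetical sketch: a logical, application-specific view and a standard
# resource view over the same sensor nodes, both read like files. An
# in-memory dict stands in for the real distributed filesystem.
namespace = {
    # logical view: organized by habitat region, as a field scientist might want
    "/logical/north_pond/node12/temperature": 18.4,
    "/logical/north_pond/node12/humidity": 0.71,
    # standard resource view: exposes cost information to the application
    "/resources/node12/energy_mj": 5200,
    "/resources/node12/storage_free_kb": 310,
}

def read(path):
    """Read a sensor attribute or resource value as if it were a file."""
    return namespace[path]

def ls(prefix):
    """List entries under a directory, like a filesystem."""
    return sorted(p for p in namespace if p.startswith(prefix.rstrip("/") + "/"))

# An application can check remaining energy before scheduling an expensive
# query against the same node's sensed data.
if read("/resources/node12/energy_mj") > 1000:
    reading = read("/logical/north_pond/node12/temperature")
```

The two views name the same physical node, which is what lets applications weigh the utility of a query (the logical view) against its resource cost (the resource view) through one uniform interface.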
Chapter 2

Background

Future smart environments will be characterized by multiple nodes that sense, collect, and disseminate information about the real world through a wireless network. There is a wide range of applications for sensor networks, with differing requirements. Despite the relatively recent emergence of sensor networks as a field of study, a large number of sensor hardware platforms and software elements (operating systems, networking protocols, database systems, etc.) have already emerged. In this chapter, we present an overview of sensor network components in a bottom-up fashion, starting from the hardware capabilities of individual sensor nodes, moving up through the different sensor network subsystems, and concluding with examples of sensor network applications.

The remainder of this chapter is organized as follows: Section 2.1 presents a detailed overview of the components of sensor network infrastructure. Section 2.2 presents a range of emerging sensor platforms and summarizes their hardware and software capabilities. After describing low-level components and services, we move on to a high-level classification of sensor networks. We begin by presenting some basic definitions that we use throughout this dissertation in Section 2.4 and then describe sensor network architectures in Section 2.5. Section 2.6 classifies the overall communication into two categories, application and infrastructure, whereas Section 2.7 presents various important data delivery models and Section 2.8 classifies network infrastructure as static or dynamic. Finally, Section 2.9 presents canonical sensor network applications, followed by an overview of the design goals and performance metrics for a typical sensor network protocol architecture in Section 2.10.
2.1
Components of sensor network infrastructure
Figure 2.1 shows important sensor network infrastructure components.
• Hardware: A micro-sensor typically consists of the following five components:
1. Transducer
2. Microprocessor/Microcontroller
3. Memory and persistent storage
4. Transceiver (radio)
5. Battery
Figure 2.1: WSN infrastructure components (application; programming abstractions (database/filesystem); services such as routing, data aggregation/compression, data-centric storage, collaborative signal processing, MAC, localization, time synchronization, calibration, and storage; operating system; hardware: processor, radio, transducer, memory/storage, and battery)
(a) MTS300/310 Multi-Sensor Board with light sensor, temperature sensor, and microphone
(b) Digital Linear Actuator
Figure 2.2: Examples of Sensors and Actuators

There are large differences in sensor hardware capabilities, ranging from deeply embedded nodes that are considerably resource constrained to richer PDA- or laptop-class nodes with significantly more resources. Below, we describe each of the above components in detail.
– Transducer: The IEEE 1451.2 standard [7] defines a transducer as a device that converts energy from one domain to another and is calibrated to minimize errors during the conversion process. Both sensors and actuators fall into this category. IEEE 1451.2 defines a sensor as a component that provides a useful output in response to a chemical, biological, or physical phenomenon; typically, a sensor converts physical, biological, or chemical parameters into an electrical signal. Figure 2.2(a) shows a multi-sensor board available from Crossbow Technology [131]. Typical examples of sensors include microphones, thermometers, and position and pressure sensors. An actuator, on the other hand, is defined as a transducer that provides a physical output in response to a stimulating variable or signal; it typically accepts an electrical signal and converts it into a physical action. Figure 2.2(b) shows a digital linear actuator (from Siemens Corporation [13]) that regulates airflow in the throttle bypass valve to achieve optimal idling. Typical examples of actuators include digital sprinklers, loudspeakers, digital linear actuators, and fuel pressure regulators.
– Microprocessor/Microcontroller: Each sensor node is equipped with a general-purpose microprocessor (or an embedded microcontroller) on top of which the application runs. Modern microcontrollers typically integrate flash storage, RAM, A/D converters, and digital I/O onto a single integrated circuit. The availability of a microcontroller makes it possible to carry out in-network processing to adaptively optimize the network and improve its energy efficiency. The capabilities of the microcontroller as well as the available memory determine the types of processing that are feasible on the sensor nodes.
DC Voltage | Transfer Current | Transfer Power | Transfer Rate | Transfer Energy/MByte
3.3 V | 25 mA | 0.0825 W | 1.5 MByte/s | 0.055 J
5 V | 45 mA | 0.225 W | 1.5 MByte/s | 0.15 J
Table 2.1: Energy Characteristics (Flash Memory)
DC Voltage | Write Current | Write Power | Standby Current | Standby Power | Transfer Rate (max) | Transfer Energy/MByte
+3.3 V | 300 mA | 0.99 W | 65 mA | 0.21 W | 3 MByte/s | 0.33 J
+5 V | 330 mA | 1.65 W | 80 mA | 0.4 W | 3 MByte/s | 0.55 J
Table 2.2: Energy Characteristics (IBM 340 MB Microdrive)
Most modern microcontrollers support multiple operating modes. For example, the ATmega128(L) microcontroller [1] supports six sleep modes, namely idle, ADC noise reduction, power-save, power-down, standby, and extended standby. Intelligent management of CPU modes can lead to significant energy savings. Traditional low-voltage microcontrollers have a 2.7 to 3.3 V operating range; modern microcontrollers operate down to 1.8 V. Several microcontrollers can also dynamically adjust their operating frequency to allow further savings in power; the ATmega128(L), for instance, supports a software-selectable clock frequency.
– Memory and persistent storage: Memory provides the required space for computation. Most modern microcontrollers contain between 1 and 128 KB of on-chip program storage. This memory can be used both to store code and to store data. In real-time sensor network applications, data is stored for a short period; typically it is processed and forwarded toward an observer/base station. Compact flash memories are the most promising storage technology for sensor networks. They have excellent power dissipation properties compared to conventional secondary storage devices such as magnetic disks; their energy properties are discussed in more detail below. Moreover, the trend is for a continuous drop in their prices, since they are becoming a commodity due to their use in applications such as digital cameras; over the past two months (August-September 2005), the average price of flash memory cards has dropped by half. For example, currently, a 1 GByte USB flash memory ($180) is only slightly more expensive than a 1 GByte microdrive ($164). Finally, they have a smaller form factor than magnetic disks. Compact flash memory does have some disadvantages, most notably in how it is accessed and in device lifetime, but these are not limiting for collaborative storage. Several of the currently available sensor nodes include flash memories as a storage device. As a result, throughout this dissertation, we focus on flash memories exclusively as storage devices and profile some typical devices currently available on the market. Tables 2.1 and 2.2 list important power characteristics for a representative flash memory and an IBM microdrive, respectively.
(a) MICA2 Series Mote
(b) MICA2DOT Series Mote
Figure 2.3: MICA2 and MICA2DOT series Motes
Figure 2.4: MICAZ Series Mote
– Transceiver: After collecting information about the phenomenon, the sensor needs to communicate this information to the observer. The transceiver can be a radio in the case of a wireless micro-sensor network, or it can be a serial interface such as RS-232 in a wired environment. Figure 2.3 shows a MICA2 series and a MICA2DOT series mote; the term mote refers to the combined processor/radio board. Recent advancements have made it possible to equip commercial sensors with IEEE 802.15.4 compliant, low-power, high data rate (250 kbps) radios (shown in Figure 2.4). Table 2.3 lists important energy characteristics for an IEEE 802.15.4 compliant radio.
Table 2.3: Radio Energy Characteristics
Device | 802.15.4/ZigBee
Transmit Power | 0.0552 W
Receive Power | 0.0591 W
Idle Power | 0.00006 W
Transmit Rate | 250 kb/s
Transfer Energy/MByte | 1.766 J
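The last row of Table 2.3 can likewise be reproduced from the transmit power and data rate, assuming 1 MByte = 8 × 10⁶ bits and the 250 kbit/s rate quoted in the text:

```python
# Reproducing the last row of Table 2.3: energy to transmit one MByte over
# an 802.15.4 radio. Assumes 1 MByte = 8e6 bits and the 250 kbit/s rate
# quoted in the text; transmit power is 0.0552 W.

TX_POWER_W = 0.0552
RATE_BITS_PER_S = 250_000

def radio_energy_per_mbyte(power_w=TX_POWER_W, rate_bps=RATE_BITS_PER_S):
    seconds_per_mbyte = 8e6 / rate_bps      # 32 s to push one MByte
    return power_w * seconds_per_mbyte      # joules per MByte

print(round(radio_energy_per_mbyte(), 3))   # 1.766
```

At roughly 1.77 J/MByte for the radio versus 0.055 J/MByte for the flash transfer of Table 2.1, transmitting data costs about 30 times more energy than moving it to local storage, which underlies the case for in-network processing and collaborative storage made throughout this chapter.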
– Battery: Battery power is required to support all of the hardware. Three prominent battery technologies are useful when building a WSN: alkaline, lithium, and nickel metal hydride. Further details on their energy characteristics can be found in [90]. In some cases ambient energy can be harvested to extend the battery life; for example, the use of solar energy [12], vibration energy [168], and wind energy has been explored. Solar cells, for instance, can produce energy from sunlight, which, if stored, can be used during nighttime operation. • Operating system: The primary goal of the operating system is to provide high-level abstractions of low-level hardware components. It also controls access to the hardware. The extreme resource scarcity of deeply embedded micro-sensors requires specialized operating system models that differ significantly from those used in time-shared systems. More resource-rich sensor nodes, such as the Stargate [186], typically have some variant of the Linux operating system installed on them. We now give a high-level description of TinyOS. TinyOS [90, 91] is an event-driven, component-based operating system designed specifically for resource-constrained sensor nodes; it has a very small memory footprint. The TinyOS system, libraries, and applications are written in nesC [11], an extension of the C programming language that promotes structured, component-based application programming. nesC has two types of components, namely modules and configurations. A nesC application consists of one or more components statically linked (also called wired) together to form an executable. TinyOS executes only one program, consisting of the system components needed for a single application. TinyOS supports two threads of execution, namely tasks and hardware event handlers. Once scheduled, tasks run to completion and do not preempt other tasks.
Hardware event handlers run in response to a hardware interrupt and may preempt the execution of a task or of another hardware event handler. TinyOS is an open-source operating system with a very active, global user community. • Medium Access Control (MAC) protocol: Traffic in a WSN is collaborative (and not end-to-end, as is typical in the Internet). Because of the need for energy efficiency and self-configuration, as well as the relatively low importance of traditional metrics such as per-node fairness and latency, one cannot simply use traditional, heavy-weight MAC layer protocols such as 802.11 [142]. Motivated by this demand, researchers have proposed and evaluated several novel medium access control (MAC) layer protocols. Polastre et al. [153] classify MAC layer protocols into two basic classes: slotted protocols and sampling (or contention) protocols. In the case of slotted protocols, time is divided
into slots, and radio modes are scheduled as receiving, transmitting, or idle in terms of these slots. Slotted protocols include the TDMA protocol family, S-MAC [215, 216], T-MAC [198], IEEE 802.15.4 [8], and others [28]. In sampling-based protocols, there is no notion of time slots or coordinated schedules; nodes periodically wake up and contend for the channel. Examples of sampling MACs include B-MAC [152] and WiseMAC [22]. A survey of MAC layer protocols in sensor networks can be found in [21]. • Localization: A localization service enables a sensor node to find its coordinates, which can be either physical or virtual. Localization is a fundamental operation in mobile and self-configuring networks such as sensor networks and mobile ad hoc networks. Sensor location is often critical for data interpretation. Moreover, network protocols such as geographic routing and geographic storage require individual sensors to know their coordinates. Not surprisingly, there has been considerable research on localization mechanisms: algorithms and infrastructure designed to allow (static but autonomous) sensors to determine their location. For static sensor networks, localization is a one-time (or low-frequency) activity; a sensor finds its location once and uses it in all of its future readings. Section 5.2 presents a brief overview of state-of-the-art localization techniques. In contrast, we consider localization for mobile sensors: when sensors are mobile, localization must be invoked periodically to enable the sensors to track their location. To that end, we proposed and investigated adaptive and predictive protocols that control the frequency of localization based on sensor mobility behavior, reducing the energy requirements of localization while bounding the localization error. Further details of this study can be found in Chapter 5.
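As a concrete illustration of range-based localization, the sketch below estimates a node's 2-D position by multilateration from measured distances to beacons with known coordinates, linearizing the range equations and solving them in the least-squares sense. The beacon layout and function names are illustrative, not taken from any particular system in the literature.

```python
import math

def multilaterate(beacons, dists):
    """Estimate a 2-D position from three or more beacons with known
    coordinates and measured distances, by linearizing the range
    equations and solving them in the least-squares sense."""
    (x1, y1), d1 = beacons[0], dists[0]
    # Subtracting the first range equation from each of the others
    # yields one linear equation ax*x + ay*y = b per remaining beacon.
    rows = []
    for (xi, yi), di in zip(beacons[1:], dists[1:]):
        ax = 2 * (xi - x1)
        ay = 2 * (yi - y1)
        b = d1**2 - di**2 + xi**2 - x1**2 + yi**2 - y1**2
        rows.append((ax, ay, b))
    # Solve the 2x2 normal equations of the least-squares problem.
    sxx = sum(ax * ax for ax, _, _ in rows)
    sxy = sum(ax * ay for ax, ay, _ in rows)
    syy = sum(ay * ay for _, ay, _ in rows)
    sxb = sum(ax * b for ax, _, b in rows)
    syb = sum(ay * b for _, ay, b in rows)
    det = sxx * syy - sxy * sxy
    x = (syy * sxb - sxy * syb) / det
    y = (sxx * syb - sxy * sxb) / det
    return x, y

# Three beacons at known positions; ranges measured from a node at (2, 3).
beacons = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
node = (2.0, 3.0)
dists = [math.dist(node, b) for b in beacons]
x, y = multilaterate(beacons, dists)
print(round(x, 6), round(y, 6))   # 2.0 3.0
```

With noisy range estimates the same solver yields a least-squares fit; the adaptive protocols of Chapter 5 control how often such a computation needs to be invoked on a mobile node.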
• Time synchronization: Time synchronization is a fundamental service in any distributed system. In the context of WSNs in particular, time synchronization is needed for various activities including data aggregation (e.g., for distributed beamforming arrays [213]), some localization techniques [77], event detection and coordination, and system debugging. However, the large network scale, the energy and computational constraints, and the unattended nature of the network render existing time synchronization protocols, such as NTP [133] (the de facto clock synchronization protocol for the Internet), impractical. Significant research has focused on techniques for efficient time synchronization [59, 60, 61, 75, 74, 199, 107, 123]. • Calibration: Whitehouse et al. [209] describe calibration as the process of forcing a device to conform to a particular input-output mapping. WSN properties such as the large scale of the network, its unattended nature, and the dynamic and partially (in some cases even fully) unobservable environment make calibration a real and challenging problem. To that end, researchers have proposed novel techniques for auto-calibration in sensor networks [94, 208, 209]. • Storage: As described earlier in Section 2.1, a sensor node typically provides limited persistent storage. To that end, in Chapter 4, we consider the following problem: how to use the limited persistent storage of a sensor to store sampled data effectively. A more detailed review of storage management issues in sensor networks can be found in [194]. • Collaborative signal processing: Since each sensor has limited computational power and bandwidth and different tasks have different accuracy requirements, researchers
have argued for multi-resolution spatio-temporal processing of sensor signals. Further details on collaborative signal processing can be found in [111]. • Routing: There is significant research on routing algorithms for mobile ad hoc networks, such as DSR [102], AODV [147], DSDV [148], and LAR [110]. However, several features of WSNs, such as their data-centric nature, the premium on resource usage, and their collaborative traffic, motivate the need for a new set of routing protocols. Routing in sensor networks has therefore been an active area of research. A variety of geographic routing protocols [41, 66, 106, 124, 136, 211, 212, 218] and data-centric routing algorithms [97, 112] have been proposed and evaluated in the context of WSNs. Heinzelman et al. [86] proposed LEACH, a clustering-based, energy-efficient data gathering protocol; LEACH is most suitable for the static, continuously reporting sensor networks described in Section 2.7. Researchers have also proposed routing protocols geared toward real-time applications in WSNs; examples include RAP [126], SPEED [84], and JITS [124]. There has also been some research [117, 124, 184] on scheduling and coordination in real-time sensor networks. One can imagine an information gathering and dissemination system residing on top of a basic routing framework, receiving basic host-to-host connectivity service from it. In Chapter 3, we describe one such information dissemination scheme. • Data-centric storage: Data indexing and retrieval is one of the challenges in storage management for sensor networks. Many sensor network applications are likely to be data-centric: observers name data in terms of attributes or content. For example, a commander may be interested in enemy tank movements. This data-centric characteristic of sensor networks is similar to many peer-to-peer (P2P) environments [27]. The two fundamental approaches for data indexing and retrieval are structured and unstructured.
In the structured approach, data are placed at specific locations (e.g., using hashing on keys) to make retrieval more efficient [187, 158]. In the unstructured approach, in contrast, data are not tied to any precomputed location [163]; searching for data is accordingly difficult, since the data may reside anywhere within the network. Researchers have proposed and evaluated techniques for data lookup, including random walks and their variants. We believe that an unstructured approach might be too expensive for a WSN. Ratnasamy et al. [156] proposed the Geographic Hash Table (GHT) system, a structured P2P solution for data-centric storage (DCS) in sensor networks. Although GHT provides functionality equivalent to structured P2P systems, it addresses several new challenges posed by resource-constrained and dynamic environments, such as node failure and network topology changes. • Data aggregation and compression: It is widely accepted that transporting raw data from sensor nodes to the base station is energy inefficient. Since sensors are equipped with processors, researchers have argued for pushing intelligence into the network, termed in-network processing, to reduce the size of the data close to its sources. In-network processing typically consists of data processing activities that sensors perform within the network, ranging from data summarization and aggregation to filtering. For example, Directed Diffusion [97] supports the use of application-specific filters, commonly written in a low-level programming language,
for in-network processing. On the other hand, TAG [128] takes a database-centric approach and argues for the use of a well-known declarative query language and database operators for in-network data aggregation. • Programming abstractions: WSNs typically consist of a diverse set of hardware and software elements. Hardware elements include a wide variety of sensor and actuator types, ranging from COTS to highly specialized, one-of-a-kind parts. Software elements draw from numerous domains, including the natural sciences, artificial intelligence, sensor networks, and embedded systems. The heterogeneity and resource constraints of typical WSNs pose daunting challenges to system and application development. In many cases, WSNs will be deployed and used by scientists (e.g., ecologists) with limited programming skills. These challenges are further exacerbated by the lack of simple abstractions for the use and development of these systems. Simple-to-use, familiar, high-level abstractions for sensor networks can significantly ease the design and deployment of real-world sensor network applications. A commonly proposed abstraction for WSNs is that of a database [214]. In this dissertation, we argue that a filesystem abstraction can be applied to WSNs; the proposed abstraction shields application developers from the complexity and heterogeneity of the underlying infrastructure and results in a flexible and intuitive system. • Applications: Sensor networks are application-specific networks. Detailed knowledge of system-level components (hardware capabilities, software functionalities, and system services) as well as of the underlying differences between micro-sensor applications is needed to successfully build and deploy efficient real-world sensor networks. In this section, we overviewed important system-level components (both hardware components and system services). In the following sections, we present a range of sensor platforms (Section 2.2) and a taxonomy of sensor networks (Section 2.3).
We believe that this taxonomy will aid network designers in making better decisions regarding the organization of the network and the choice of network protocol and information dissemination models. After presenting the taxonomy, in Section 2.9 we give several examples of canonical sensor network applications.
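To make the in-network aggregation idea concrete, the sketch below computes a network-wide average TAG-style: each node merges its own reading with the partial aggregates (sum, count) received from its children, so a single compact record travels up each link of the routing tree instead of every raw reading. The tree topology and readings are illustrative.

```python
# A sketch of TAG-style in-network aggregation: each node combines its own
# reading with the partial aggregates of its children, so only one compact
# (sum, count) record travels up each link instead of every raw reading.
# The routing tree and readings below are illustrative.

def aggregate(tree, readings, node):
    """Return (sum, count) for the subtree rooted at `node`."""
    s, c = readings[node], 1
    for child in tree.get(node, []):
        cs, cc = aggregate(tree, readings, child)
        s += cs
        c += cc
    return s, c

# Routing tree rooted at the base station, node 0.
tree = {0: [1, 2], 1: [3, 4], 2: [5]}
readings = {0: 20.0, 1: 22.0, 2: 21.0, 3: 19.0, 4: 23.0, 5: 18.0}

s, c = aggregate(tree, readings, 0)
print(s / c)   # network-wide average reading: 20.5
```

Because (sum, count) is decomposable, the same merge step works at every level of the tree; aggregates such as MIN, MAX, and COUNT decompose in the same way.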
2.2
Sensor platforms
Despite the relatively recent emergence of sensor networks as a field of study, a large number of sensor hardware platforms and software elements (operating systems, networking protocols, database systems, etc.) have already emerged, with a large variety in capabilities and features. In this section, we briefly overview typical characteristics of the current generation of sensors. Sensor hardware platforms vary in capability from miniature sensors such as the Berkeley motes [131], which are equipped with 8-bit microprocessors, a few kBytes of memory, and low-bandwidth, low-range radios, to PDA-class sensors such as PASTAs [144]. Other sensor platforms include Mantis [129, 24] and WINS [164]; Table 2.4 presents a comparison of various sensor platforms. On the software side, diversity exists at various layers, including the operating system, programming languages, and communication protocols. For example, TinyOS [91, 90], MANTIS OS (MOS) [129], RTOS [6], Linux, and Windows CE are all used as operating systems on the different
Table 2.4: Sensor Platforms
Platform | Processor | CPU speed | RAM | Max. Data Rate | Operating system
MICA | ATMega103L | 4 MHz, 8-bit | 4 kB | 40 kb/s | TinyOS
MicaZ | ATMega128L | 7.37 MHz, 8-bit | 4 kB | 250 kb/s | TinyOS
Mantis Nymph | ATMega128L | 7.37 MHz, 8-bit | 4 kB + 64 kB | 76.8 kb/s | MOS
RockWell WINS | Intel StrongARM | 133 MHz | 1 MB | 100 kb/s | Win CE
Stargate | Intel XScale | 400 MHz | 64 MB | 11 Mb/s | Linux
platforms. At the MAC layer, S-MAC [215], TDMA [86], IEEE 802.11, and various other protocols exist. Directed Diffusion [97], RAP [126], SPEED [84], and LEACH [86] represent some of the proposed routing and data gathering protocols. For various technical and economic reasons, a sensor network deployed by a single organization can consist of a diverse set of sensors. In fact, existing deployments such as Great Duck Island [56] capitalize on the heterogeneity of the sensors within the network, and sensor networks deployed by independent organizations may be even more heterogeneous. This great diversity presents significant challenges to sensor network interoperability and motivates the need for resource discovery. We believe that this heterogeneity is inevitable, and, furthermore, that an attempt to impose uniformity on this diversity would stifle experimentation and innovation. But without some commonality, interoperability is not possible. To that end, in Chapter 8, we present a resource discovery protocol for heterogeneous, large-scale, next-generation sensor networks, and we outline some of the associated challenges. Our approach embraces the adoption of minimal, extensible standards that promote interoperability by supporting the discovery of formats and protocols.
2.3
Sensor Network Taxonomy
We now present a classification of wireless micro-sensor networks according to the factors that are most relevant to communication. The focus on communication is due to its dominant effect on energy consumption for the current generation of sensors. Further, communication through an unreliable and shared medium represents a unique challenge for wireless networks such as WSNs. We examine the components of a typical micro-sensor network as well as the different types of communication required to carry out the essential functionalities. We then compare different data delivery models and network dynamics to create a taxonomy of wireless micro-sensor network communication. We believe that this taxonomy will aid network designers in making better decisions regarding the organization of the network, the network protocol, and information dissemination models. Furthermore, it will aid in developing realistic sensor network models and benchmarks for use in future sensor network research.
2.4
Micro-sensor Network Components
Throughout this dissertation, we use the following terminology:
• Sensor: The device that implements the physical sensing of environmental phenomena and the reporting of measurements (through wireless communication). Typically, it consists of five components: sensing hardware, memory, battery, embedded processor, and transceiver (as described in Section 2.1).
• Observer: The end user interested in obtaining the information disseminated by the sensor network about the phenomenon. The observer may indicate interests (or queries) to the network and receive responses to these queries. Multiple observers may exist in a sensor network.
• Phenomenon: The entity of interest to the observer that is being sensed and potentially analyzed/filtered by the sensor network. Multiple phenomena may be under observation concurrently in the same network.
In a sensing application, the observer is interested in monitoring the behavior of the phenomenon under some specified performance requirements (e.g., accuracy or delay). In a typical sensor network, the individual sensors sample local values (measurements) and disseminate information as needed to other sensors and eventually to the observer. The measurements taken by the sensors are discrete samples of the physical phenomenon, subject to individual sensor measurement accuracy as well as location with respect to the phenomenon.
2.5
Sensor Network Architecture
A sensor network can be thought of as a tool for measuring and relaying information about the phenomenon to the observer within the desired performance bound and deployment cost. As such, the organization of the network may be viewed as follows:
1. Infrastructure: The infrastructure consists of the sensors and their current deployment status. More specifically, the infrastructure is influenced by the characteristics of the sensors (e.g., sensing accuracy, memory size, battery life, transmission range) and the deployment strategy (e.g., sensor density, sensor location, sensor mobility).
2. Network Protocol: The network protocol is responsible for creating paths and accomplishing communication between the sensors and the observer(s).
3. Application/Observer: The observer(s) express their interest in the phenomenon through queries, which the network answers using the distributed data that the sensors are capable of collecting. These queries can be static (the sensors are preprogrammed to report data according to a specific pattern) or dynamic. The network may participate in synthesizing the query (for example, by filtering some sensor data or fusing several measurements into one value); we consider such intelligence to be part of the translation process between observer interests and their low-level implementation.
2.6
Communication Models
There are multiple ways for a sensor network to meet its accuracy and delay requirements; a well-designed network meets these requirements while optimizing sensor energy usage and providing fault tolerance. By studying the communication patterns systematically, the network
designer will be able to choose the infrastructure and communication protocol that provide the best combination of performance, robustness, efficiency, and deployment cost. Conceptually, communication within a sensor network can be classified into two categories: application and infrastructure. The network protocol must support both types of communication. Application communication relates to the transfer of sensed data (or information derived from it) with the goal of informing the observer about the phenomena. Infrastructure communication refers to the communication needed to configure, maintain, and optimize network operation. More specifically, because of the ad hoc nature of sensor networks, sensors must be able to discover paths to other sensors of interest and to the observer, regardless of sensor mobility or failure. Thus, infrastructure communication is needed to keep the network functional, ensure robust operation in dynamic environments, and optimize overall performance. We note that such infrastructure communication is highly influenced by the application interests, since the network must reconfigure itself to best satisfy these interests. As infrastructure communication represents the overhead of the protocol, it is important to minimize it while ensuring that the network can support efficient application communication. In sensor networks, an initial phase of infrastructure communication is needed to set up the network. Furthermore, if the sensors are energy constrained, there will be additional communication for reconfiguration. Similarly, if the sensors are mobile or the observer interests are dynamic, additional communication is needed for path discovery and reconfiguration. For example, in a clustering protocol, infrastructure communication is required for the formation of clusters and for cluster-head selection; under mobility or sensor failure, this communication must be repeated (periodically or upon detecting failure).
Finally, infrastructure communication is used for network optimization. Consider the Frisbee model, where the set of active sensors follows a moving phenomenon to optimize energy efficiency [42]. In this case, the sensors wake up other sensors in the network using infrastructure communication. Sensor networks require both application and infrastructure communication. The amount of required communication is highly influenced by the networking protocol used. Application communication is optimized by reporting measurements at the minimal rate that will satisfy the accuracy and delay requirements given sensor abilities and the quality of the paths between the sensors and the observer. The infrastructure communication is generated by the networking protocol in response to application requests or events in the network. Investing in infrastructure communication can reduce application traffic and optimize overall network operation.
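The trade-off between infrastructure and application communication can be illustrated with a toy transmission count: in a flat network every report is relayed hop by hop to the observer, while a clustered network pays a one-time infrastructure cost (cluster formation and head election) so that members reach their head in one hop and only the heads relay aggregates over the long path. All of the numbers below are illustrative assumptions, not measurements.

```python
# Toy transmission accounting: investing in infrastructure communication
# (cluster formation) reduces application traffic. Hop counts, setup costs,
# and network sizes are illustrative assumptions.

def flat_transmissions(n_sensors, rounds, hops_to_observer):
    # Every report is relayed hop by hop all the way to the observer.
    return n_sensors * rounds * hops_to_observer

def clustered_transmissions(n_sensors, n_clusters, rounds,
                            hops_to_observer, setup_msgs_per_node=2):
    infrastructure = n_sensors * setup_msgs_per_node       # formation/election
    members = (n_sensors - n_clusters) * rounds * 1        # one hop to the head
    heads = n_clusters * rounds * hops_to_observer         # one aggregate/round
    return infrastructure + members + heads

# 100 sensors, 10 clusters, 50 reporting rounds, observer 5 hops away:
print(flat_transmissions(100, 50, 5))            # 25000
print(clustered_transmissions(100, 10, 50, 5))   # 200 + 4500 + 2500 = 7200
```

Under these assumptions the clustered network sends fewer than a third of the transmissions of the flat one, even after paying the infrastructure overhead; with fewer rounds or nearer observers the balance can tip the other way, which is why the investment must be weighed case by case.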
2.7
Data Delivery Models
Sensor networks can be classified in terms of the data delivery required by the application (observer) interest as continuous, event-driven, observer-initiated, or hybrid. These models govern the generation of the application traffic. In the continuous model, the sensors communicate their data continuously at a pre-specified rate. The authors of [86] showed that clustering is most efficient for static networks where data is continuously transmitted; for dynamic sensor networks, depending upon the degree of mobility, clustering may be applicable as well. In the event-driven model, the sensors report information only if an event of interest occurs; the observer is interested only in the occurrence of a specific phenomenon or set of phenomena. In the observer-initiated (or request-reply) model, the sensors report their results only in response to an explicit request from the observer (either directly, or indirectly through other sensors). Finally, the three approaches can coexist in the same network; we refer to this as the hybrid model.
The above classification focuses on data delivery from the application perspective, and not the actual flow of data packets between the sensors and the observer; this is a routing problem subject to the network protocol.
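The four delivery models amount to different per-sample reporting policies on a sensor node; the minimal sketch below makes the distinction explicit (the flag names are our own, not drawn from any protocol):

```python
# The four data delivery models expressed as per-sample reporting policies.
# Model names follow the classification above; the condition flags are
# illustrative, not from any particular protocol.

def should_report(model, *, period_due=False, event_detected=False,
                  query_pending=False):
    if model == "continuous":
        return period_due                      # fixed, pre-specified rate
    if model == "event-driven":
        return event_detected                  # report only on events
    if model == "observer-initiated":
        return query_pending                   # report only when asked
    if model == "hybrid":
        return period_due or event_detected or query_pending
    raise ValueError(f"unknown model: {model}")

print(should_report("event-driven", event_detected=True))      # True
print(should_report("observer-initiated", period_due=True))    # False
```

The policy determines only when application traffic is generated; how the resulting packets reach the observer remains, as noted above, a routing problem for the network protocol.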
2.8
Network Dynamics Models
Static Sensor Networks
In static sensor networks, there is no motion among the communicating sensors, the observer, and the phenomenon. An example is a group of sensors deployed for temperature sensing. For these types of sensor networks, studies have shown that localized algorithms can be used effectively [97, 86]. In localized algorithms, sensors communicate with nodes in their locality; an elected node relays a summary of the local observations to the observer, perhaps through one or more levels of hierarchy. Such algorithms extend the lifetime of the sensor network because they trade off communication for local computation [86]. In this type of network, sensor nodes require initial set-up infrastructure communication to create the path between the observer and the sensors, with the remaining traffic exclusively application communication.¹
Dynamic Sensor Networks
In dynamic sensor networks, the sensors themselves, the observer, or the phenomenon are mobile. Dynamic sensor networks can be further classified by considering the motion of these components. Motion is important from the communication perspective, since the degree and type of communication depend on network dynamics. We believe that each of the following cases requires different infrastructures, data delivery models, and protocols:
• Mobile observer. In this case, the observer is mobile with respect to the sensors and phenomena. An example of this paradigm is sensors deployed in an inhospitable area for environmental monitoring: a plane might fly over the field periodically to collect information from the sensor network. The observer, in the plane, is moving relative to the sensors and phenomena on the ground.
• Mobile sensors. In this case, the sensors are moving with respect to each other and the observer. For example, consider traffic monitoring implemented by attaching sensors to taxis. As the taxis move, the attached sensors continuously communicate with each other about their own observations of the traffic conditions. If the sensors are cooperative, the communication paradigm imposes additional constraints, such as detecting the link-layer addresses of neighbors and constructing localization and information dissemination structures. Previous work [97] argues that the overhead of maintaining a globally unique sensor ID in a hierarchical fashion, like an IP address, is expensive and unnecessary; instead, sensors should communicate with their neighbors using link-layer MAC addresses. In such networks, a proactive algorithm with local patching for repairing a path can be used, so that information about the phenomenon is always available to the observer regardless of the mobility of the individual sensors.
Note that if energy is limited among the nodes, the network will require infrastructure communication to maintain a path between the observer and the phenomenon as nodes run out of energy.
• Mobile phenomena. In this case, the phenomenon itself is moving. A canonical example of this paradigm is sensors deployed for animal detection/tracking. In this case the infrastructure-level communication should be event-driven. Depending on the density of phenomena, keeping all sensor nodes active all the time can be inefficient; only the sensors in the vicinity of the mobile phenomenon need to be active. The number of active sensors in the vicinity of the phenomenon can be determined by application-specific goals such as accuracy, latency, and energy efficiency. A model that is well-suited to this case is the Frisbee model [42]. • Hybrid. This represents some combination of the above three cases. For example, in Zebranet [103] sensors are attached to animals to track their mobility patterns, nocturnal behavior, and other important characteristics. In such a case, both the sensors and the phenomenon are mobile.
2.9
Canonical Applications of WSN
Although it is impossible to enumerate and discuss all existing and potential applications, in the remainder of this section we classify the existing applications into three classes, namely environmental monitoring, search and rescue/battlefield operations, and target tracking. We believe that these classes encompass a significant fraction, if not all, of sensor network applications. • Environmental monitoring One can imagine thousands of tiny, unattended sensors embedded in complex physical environments. Typical examples of such applications include marine microorganism monitoring, the study of algal blooms, and habitat monitoring of various species of birds [56], rare plants [161], and animals [103]. Zebranet [103] is an inter-disciplinary project with a focus on zoology and computer systems. Zebranet aims to enable zoologists to study migration patterns, social structures, and a range of other socio-ecological factors of various animal species by letting them collect detailed information about numerous animals over a large area at low cost. The remote Ecological Micro-Sensor Network [161], which aims at remote visual surveillance of federally listed rare and endangered plants, is a similar example. The network provides near-real-time monitoring of important events such as visitation by pollinators, consumption by herbivores, and even human visits, along with monitoring a number of weather conditions and events. In this case, sensors are placed in a variety of habitats, ranging from scattered low shrubs to dense tropical forests with severe environmental conditions; for example, some locations frequently freeze. Other examples are the James Reserve project at UCLA [42] and the Great Duck Island [56] project at UC Berkeley. The EarthScope project [5] aspires to address longstanding and fundamental questions about earthquake physics, volcanic processes, geodynamics, and crustal fluids.
It is a continent-scale seismic observatory consisting of a range of geophysical instruments including digital seismic arrays, strainmeters, magnetotelluric sensors, and GPS receivers. It can significantly expand our capabilities to observe the structure and ongoing deformation of the North American continent. Improved understanding of the structures and processes that affect our environment can provide key insights for hazard assessment and more precise estimates of natural resource potential.
Ongoing interdisciplinary research [190] has the potential to revitalize the lives of millions of people living in Central Africa by freeing them from fatal diseases, including river blindness. These scientists are planning to deploy tiny sensors in river beds to raise an alarm over the presence of larvae. These cheap sensors can offer a cost-effective alternative to the existing method of spreading larvacide over thousands of kilometers. Recently, structural health monitoring applications [92] have received considerable attention. In these applications sensors are deployed to monitor key civil infrastructure including bridges, tunnels, national highways, power stations, and water plants. These sensors can then monitor for various anomalous conditions such as gas leakage, water contamination, and bridge cracks. Providing early warning about an impending tragedy can significantly improve our response to emergency situations. Sensor networks can also revolutionize the health care industry. For example, homes can be instrumented with sensors and actuators that monitor the vital signs of patients and alert doctors if a patient requires immediate medical assistance. The Center for Future Health Smart Medical Home project [17] is an example of such an effort. On the industrial front, sensors can be deployed all over a manufacturing plant to help supervisors monitor the manufacturing process at various stages in real time. Sensors can report equipment failures, manufacturing abnormalities, and other critical events, including fire and electrical hazards, within fractions of a second, thereby potentially saving millions of dollars and priceless human lives. • Search and rescue and battlefield operations Crises can arise from several factors, including the spread of fire, water contamination, the release of hazardous substances, floods, and earthquakes.
The main objective of an effective emergency and disaster relief management effort is to save precious lives and minimize the damage to critical infrastructure. Ongoing projects such as Firegrid [65] aim to develop effective real-time response systems to tackle emergencies that arise due to fire. Real-time data streamed from sensors in emergency locations is routed to a crisis management system running on a computational grid. This data then drives the selection of appropriate models and simulations to predict the evolution of the fire. The results from the simulations are used for many purposes, including estimating the impact of fire on structures (e.g., to predict an impending collapse), guiding evacuation strategies, and providing the necessary feedback to control deployed sensors. • Target tracking: Emerging technology in the form of passive and active Radio Frequency Identification (RFID) tags and sensors can significantly improve the security of goods and inventory. With the help of data reported from sensors, manufacturers can collect instantaneous and continuous data, which can have a significant impact on their businesses. For example, they can detect inefficiencies in the current process, identify improvements in resource allocation procedures, control inventory more effectively, and interoperate with other major vendors. Also, sensors can be deployed in transportation vehicles, warehouses, refrigerators, and shelves. These sensors can continuously monitor various parameters such as temperature, pressure, and humidity, and can be programmed to raise an alarm upon detecting an important event. For example, sensors can trigger an alarm if the temperature in a refrigerator falls outside a prescribed range. Such an early warning system can prevent potential mishaps such as food poisoning.
These applications have received interest from both academia and industry [23]. Researchers at UC Berkeley refer to such widely distributed systems, whose edges consist of numerous receptors such as sensor networks and RFID readers and whose interior nodes consist of traditional host computers organized using the principle of successive aggregation, as High Fan-in (HiFi) architectures [73, 48]. The project aims to provide novel data management techniques for such systems.
2.10
Performance Metrics/Design Goals
After overviewing the details of WSN infrastructure and applications, a natural question is how to select an appropriate sensor network infrastructure and protocol architecture for the problem at hand. One way to do that is to find out how the candidate infrastructures would perform for that application and then choose the one that performs best among them. This method leads to another question: how would one evaluate the performance of a given network protocol? In Chapter 6 we discuss explicitly balancing benefit and cost as a criterion for decision making in sensor networks. One of the challenges there is to synthesize these and other measures of performance into overall estimates of utility and cost. There are two fundamental ways to evaluate network protocol architectures, namely via network-centric metrics and via user-centric metrics. We now describe both categories and enumerate important metrics belonging to each. Shenker et al. [177] argued that network performance should not be measured in terms of network-centric quantities such as power and scalability, but solely in terms of the degree to which the network satisfies the service requirements of each of its end users. The following are network-centric evaluation metrics. • Energy efficiency. As sensor nodes are battery-operated, protocols must be energy-efficient to maximize system lifetime. • Low resource usage: As described earlier, sensors have limited computational power and memory; therefore protocols should be simple and have low configuration/maintenance cost. • Scalability. Scalability is also a critical factor for sensor networks. For large-scale networks, distributed protocols are needed. The protocol should be based on localized interactions and should not require global knowledge such as the current network topology.
For example, a protocol that requires a given sensor to have up-to-date knowledge of the topology of the entire network will require a great deal of communication and will not scale well as the number of nodes in the network increases. • Fault tolerance: The protocol should be adaptive so that the failure or disconnection of a few sensors can be tolerated. We now describe canonical user-centric performance evaluation metrics. Further details on the use of user-centric performance metrics can be found in Chapter 6. • Network lifetime: Ideally, the sensor network should gather useful data for as long as possible. The literature is replete with protocols that try to maximize the network lifetime [180, 178, 87, 103, 212, 215, 218, 198, 156].
• Accuracy: Accuracy is an application-centric metric for evaluating the performance of a network protocol architecture. Typically it can be defined in terms of an application-specific function such as RMS error, deadline, quality of data, percentage of queries satisfied, freshness, or some other arbitrarily complex function. For example, in our information dissemination study discussed in Chapter 3, we propose using weighted error to measure the accuracy of information reported by a sensor network. We assume that as the distance between the information producer and consumer increases, loss in precision is acceptable (higher error is tolerable). • Coverage: Meguerdichian et al. [130] defined coverage as a measure of the quality of service of a sensor network. One way to define coverage qualitatively is how well a given sensor network can observe a specific area and detect events occurring within it in a reasonable time frame. • Cost and ease of deployment: One of the main advantages of sensor networks is their ease of deployment and their ability to self-configure in the absence of any infrastructure. For example, one can imagine sensors thrown randomly from a plane over a remote, inhospitable environment. Once on the ground, these sensors form a multi-hop wireless network in an autonomous fashion. Conner et al. [47] argued that even in indoor environments where it is possible to wall-power sensors, running new wires is expensive and time consuming. One can also imagine that running wires all over a home would require considerable effort compared to replacing sensor batteries once in a while. In outdoor environments, either it is not possible to power sensors off AC mains (for example, in an inhospitable environment) or the cost would be prohibitively high. We have studied random and grid deployment strategies for sensor network deployments in outdoor environments for both the event-driven and continuous data delivery models [193].
In our settings, we observed no difference in the performance of the network architecture (using both application-centric and network-centric metrics). Therefore, we argued that random deployment is preferable to a grid-like uniform deployment due to the reduced cost and deployment effort. Often, these goals are interrelated, and in some cases they might even conflict. For example, accuracy and cost and ease of deployment are related in certain cases. Consider a wildlife tracking application. One can build this application in two ways: by attaching sensors to animals, or by deploying sensors in the forest that detect and report when an animal comes into their range. Attaching sensors to animals might require more cost and effort (catching animals to attach sensors or replace batteries), but might yield higher accuracy. On the other hand, deploying sensors in the forest would require less cost and effort, but might have lower accuracy. We believe that such issues need to be resolved by discussions between sensor network designers and domain scientists. In the next three chapters, we will look into several critical sensor subsystems, namely information dissemination, storage, and localization.
Chapter 3
Information Dissemination in WSN
3.1
Introduction
In this study, we focus on a set of WSN applications that are event-driven (Section 2.3): events generated in the network must be disseminated to multiple interested observers, whose locations are not known a priori. Typically, sensor networks send information to a single place for analysis, taking into consideration optimizations from local aggregation (e.g., LEACH [88]) or processing data en route to a central location (e.g., MagnetOS [181]). While such central collection is important for many applications, it does not match the requirements of many event-driven applications. For example, consider a military application with sensors distributed throughout an area where they collect information regarding events such as passing vehicles, air contaminant levels, and the presence of land mines. We assume that the sensors can communicate with one another, and a soldier who moves throughout the region can contact any nearby sensor to find out both the state of that sensor and any other event information collected from the other networked sensors that may be important to the soldier. For this soldier, the events occurring in the immediate neighborhood are clearly the most important. For example, it is more important to know about a nearby land mine than one several miles away. Nonetheless, it is still important that the soldier has a general overview of the area in order to plan and make appropriate decisions. Similarly, consider a rescue scenario where a team of fire fighters is working to rescue trapped victims. In this case, the fire fighters require precise information about their immediate surroundings in order to make decisions about using resources to make progress, as well as some information about nearby areas and even the whole operation field to plan a path to the victims as well as an escape path back to safety.
In the above applications, the users are mobile and connect to nearby sensors to obtain the required information; approaches such as publish-subscribe [97] are challenged by subscriber mobility. The applications above differ from those typically studied for sensor networks in that the information is not collected centrally, but instead is utilized at several places in the network (e.g., the locations of the individuals). While some sensor network applications accomplish this in a query-driven manner, asking a central source for the latest collected information, these applications require continuous updates. A simplistic solution to this problem is to proactively flood updates from each sensor to every other sensor. This solution is extremely inefficient and does not scale to large numbers of sensors. In the scenarios we target, information from a particular sensor is most important to those surrounding it, with the value of the information decreasing as a function of distance from the sensor. Specifically, the necessary precision of information decreases as the distance between an information producer and an information consumer increases. We refer to such a
requirement as a non-uniform granularity of information dissemination requirement. We take advantage of context information to implement information dissemination with non-uniform granularity. More specifically, we build on the intuition that the value of event information drops in proportion to the distance from the event. This context, in this case the location of the origin of the event, is embedded in the packet that originates from the information source in the form of a TTL (time to live). Thus, as the distance between the source node and the sink node increases, loss in information precision is acceptable: distance from an event serves as context (the lower the TTL value, the higher the distance from the source) that moderates the decision of whether to disseminate information. The contributions of this chapter are two-fold: (1) we define information dissemination with non-uniform information granularity; and (2) we describe different protocols that achieve non-uniform information dissemination and analyze these protocols based on complexity, energy consumption, and accuracy of information. As the proposed protocols are intended to run on wireless sensor networks, they must abide by the requirements of that environment; namely, they must be energy-efficient and have low complexity. The distinguishing feature of this new application class is that it is possible to trade accuracy of disseminated information for energy. Our experimental results clearly show this tradeoff using a number of different protocols. The remainder of this chapter is organized as follows. Section 3.2 characterizes the requirements for non-uniform information dissemination protocols. Section 3.3 describes the details of several protocols. In Section 3.4, we discuss the implementation details of the protocols within the ns-2 simulator and then present our experimental results, followed by a discussion in Section 3.5, which presents more insight into our results.
Section ?? describes our attempt to apply the non-uniform information dissemination technique to another domain, namely the Grid. Section 3.6 describes related work and Section 3.7 presents conclusions and long-term future work.
3.2
Application-specific Design Goals
In Section 2.10 we described canonical design goals for sensor network protocols. In this section, we propose the following additional design goal for sensor network protocols for applications that have non-uniform information dissemination requirements. • Accuracy. Accuracy is a measure of sensing fidelity: obtaining accurate information is the primary objective of a sensor network. Accuracy is application-specific both in terms of the appropriate metric and the required fidelity level. There is a tradeoff between accuracy, latency, and energy efficiency. In the applications we target, it is acceptable for sensors to have information with low accuracy about locations that are far away, but they should have highly accurate information about locations that are close by. Because of this non-uniform information dissemination requirement, a given sensor will not have all the information from all other sensors at every point in time. Consider a case where sensor S1 receives every n-th packet from another sensor S2. In this case, S1 receives the i-th packet from S2 at time t1 and the (i+n)-th packet from S2 at time t2 (n > 1). Thus, in the interval (t1, t2), the information S1 has about S2 is not as accurate as that of a sensor that receives every update S2 sends. We define accuracy in terms of the difference between the local value of the information and the actual value.
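The staleness effect described above can be illustrated with a small sketch. This is our own worked example, not from the thesis: we assume a unit-increment data model (S2's reading grows by 1 per reporting interval, anticipating the simulation setup in Section 3.4) and that S1 refreshes its view only on every n-th update.

```python
# Hypothetical worked example of the accuracy loss from sampled updates.
# S2's reading grows by 1 per reporting interval; S1 refreshes its view
# of S2 only on every n-th update, so in between its view goes stale.

def view_error(t, n):
    """Absolute error in S1's view of S2 at time t, seeing every n-th update."""
    actual = t                      # S2's reading after t unit increments
    last_update = (t // n) * n      # most recent update S1 received
    return actual - last_update

errors_n2 = [view_error(t, 2) for t in range(6)]
# With n=2 the error alternates 0,1,0,1,... as the view goes stale, then refreshes.
```

With larger n the peak error grows to n-1, which is exactly the accuracy-for-energy trade the protocols below exploit.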
With these design goals in mind, in this chapter we present simple deterministic protocols (Filtercast and RFiltercast) and non-deterministic protocols (unbiased and biased protocols) to achieve non-uniform information dissemination. Compared to flooding, these protocols reduce the cost of communication by reducing the number of packet transmissions and receptions. At the same time, these protocols are designed to operate within the application-specific accuracy tolerance. Our results indicate that these protocols outperform flooding in terms of energy efficiency by trading off accuracy for energy while keeping the accuracy acceptable to the application. The next section describes the details of each of these protocols.
3.3
Dissemination Protocols
This section presents our proposed protocols that perform non-uniform information dissemination. As in traditional sensor networks, every sensor in the network serves as a source of information to be spread throughout the network. Unlike traditional networks where a specific node serves as the sink, every sensor in our system receives and stores some data from the other sensors in the system. We begin our protocol discussion with a traditional flooding algorithm. Flooding achieves uniform information dissemination and serves as a baseline of comparison for the rest of our protocols. Following this, we introduce two new deterministic protocols and analyze two non-deterministic protocols [191].
3.3.1 Flooding In flooding, a sensor broadcasts its data, and this is received by all of its neighbors. Each of these neighbor sensors rebroadcasts the data, and eventually each sensor in the network receives the data. Some memory of packets is retained at each sensor to ensure that the same packet is not rebroadcast more than once. If each sensor broadcasts its data, then with this flooding protocol, every sensor in the network will receive data from every other sensor. Thus, ignoring distribution latency, which is the amount of time required for a packet to travel from the source to the farthest sensor in the network, every sensor has an identical view of the network at every point in time (ignoring packet collisions and timing issues). Furthermore, the protocol itself is simple and straightforward to implement. Unfortunately the simplicity and high accuracy come at the price of high energy expenditure. The massive data replication requires active participation from every sensor in the network, and thus sensors can quickly run out of energy.
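The flooding behavior described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the thesis's ns-2 implementation: the node class, packet dictionary layout, and topology are hypothetical, and packets are identified by a (source, sequence) pair for duplicate suppression.

```python
# Hypothetical sketch of flooding with duplicate suppression: each node
# remembers packet IDs it has already handled, so a packet is forwarded
# at most once per node and eventually reaches every node.

class FloodingNode:
    def __init__(self, node_id):
        self.node_id = node_id
        self.neighbors = []             # other FloodingNode objects in range
        self.seen = set()               # (source, seq) IDs already handled
        self.view = {}                  # latest data seen from each source

    def receive(self, packet):
        pkt_id = (packet["source"], packet["seq"])
        if pkt_id in self.seen:
            return                      # duplicate: drop, do not rebroadcast
        self.seen.add(pkt_id)
        self.view[packet["source"]] = packet["data"]
        for nbr in self.neighbors:      # rebroadcast to all neighbors
            nbr.receive(packet)

# Three nodes in a line: a - b - c
a, b, c = FloodingNode("a"), FloodingNode("b"), FloodingNode("c")
a.neighbors = [b]
b.neighbors = [a, c]
c.neighbors = [b]

a.receive({"source": "a", "seq": 1, "data": 42})
```

After the broadcast, every node holds node a's reading in its view, which is the uniform dissemination (and the energy cost) that the later protocols relax.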
3.3.2 Deterministic Protocols In analyzing the flooding algorithm, it is apparent that one approach to achieving non-uniform information dissemination is simply to have intermediate nodes forward fewer packets. The two protocols we introduce here, Filtercast and RFiltercast, achieve just that by deterministic means. Filtercast Filtercast filters information at each sensor by selectively forwarding information received from other sensors. Filtercast is based on the simple idea of sampling the information received from a given source at a certain rate n, specified as a parameter to the protocol. The lower the value of n, the
more accurate the information disseminated by the protocol. When n = 1, Filtercast behaves identically to flooding. During protocol operation, each sensor keeps a count, source_cnt, of the total number of packets it has received so far from each source. A sensor forwards a packet it receives from a source only if (source_cnt mod n) == 0, and then increments source_cnt. We refer to the constant 1/n as the filtering frequency. The intuition behind this protocol is that as the hop count between a source node and a sink node increases, the amount of information re-transmitted decreases due to the cascading effect of the filtering frequency at each subsequent sensor. While this reduces the total number of transmissions compared to flooding, the state information maintained at each sensor increases. Specifically, each sensor must maintain a list of all the sources it has encountered since the start of the application and a count of the number of packets seen from each of these sources. As this grows linearly with the size of the network, it may pose scalability problems. RFiltercast One potential problem with Filtercast is the synchronization of the packets transmitted by neighbors. For example, consider a scenario where sensors s2 and s3 are one-hop neighbors of both s1 and s4, while s1 and s4 are two hops away from each other. In this case, if Filtercast is used as the dissemination protocol, then s2 and s3 will end up forwarding either all odd or all even packets (synchronized on forwarding the packets) from s1 to s4, effectively transmitting redundant information. Our intuition is that if we can remove this redundancy, we may be able to increase the accuracy of the protocol without increasing the energy expended. To address this effect, we propose another protocol: Randomized Filtercast (RFiltercast). In this variant of Filtercast, the filtering frequency 1/n is still the same for all sensors, but each sensor generates a random number r between 1 and
n and re-transmits a packet if (source_cnt mod n) == r. Intuitively, this means that each sensor considers a window of size n and will transmit only one of the packets from a given source in this window. So, for a window of size 2, half of the packets will be selected for re-transmission, but instead of always re-transmitting the first of the two packets (as in Filtercast), the sensors that choose r = 1 will transmit the first of the two packets while the sensors that choose r = 2 will transmit the second. While our intuition was that RFiltercast would expend the same energy as Filtercast, this turns out not to be true. In fact, RFiltercast transmits more packets than Filtercast, but fewer than flooding, putting its energy expenditure between the two. This effect occurs because in RFiltercast a node receives more packets from a given source, as described above, and thereby ends up transmitting more packets on behalf of that source. Note that forwarding decisions at an intermediate node are not based on packet IDs but on the number of unique packets received from the source node. Therefore, due to the reception and transmission of more packets, the energy dissipation of RFiltercast is higher than that of Filtercast. While RFiltercast has more transmissions, increasing its energy expenditure, it also has improved accuracy over Filtercast. The crucial point is that RFiltercast should, on average, propagate information faster than Filtercast, leading to more accurate data throughout the network, while still requiring less energy than flooding.
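The two forwarding decisions can be sketched as below. This is a hedged illustration under our own conventions, not the thesis code: the per-source counter starts at 0, and RFiltercast's random offset r is drawn from 0..n-1 (equivalent, up to an index shift, to the 1..n range in the text), so each node still forwards exactly one packet per window of n.

```python
import random

# Hypothetical sketch of the Filtercast / RFiltercast forwarding decisions.
# n is the sampling window: a node forwards one packet per n received
# from a given source.

class Filtercast:
    def __init__(self, n):
        self.n = n
        self.count = {}                 # packets seen so far, per source

    def should_forward(self, source):
        c = self.count.get(source, 0)
        self.count[source] = c + 1
        return c % self.n == 0          # forward the first packet of each window

class RFiltercast(Filtercast):
    def __init__(self, n, rng=random):
        super().__init__(n)
        # Random offset chosen once per node; here 0..n-1 (see lead-in).
        self.r = rng.randrange(n)

    def should_forward(self, source):
        c = self.count.get(source, 0)
        self.count[source] = c + 1
        return c % self.n == self.r     # forward the r-th packet of each window

f = Filtercast(n=2)
decisions = [f.should_forward("s1") for _ in range(4)]
# Filtercast with n=2 forwards every other packet: [True, False, True, False]

rf = RFiltercast(n=2)
rf_decisions = [rf.should_forward("s1") for _ in range(4)]
# RFiltercast still forwards exactly one packet per window of 2, but which
# one depends on this node's r, de-synchronizing neighboring nodes.
```

Two neighbors running Filtercast always pick the same packet of each window; two RFiltercast neighbors with different r cover different packets, which is the redundancy-removal argument above.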
3.3.3 Randomized Protocols Both RFiltercast and Filtercast are lightweight and easy to analyze due to their deterministic nature. However, they still have some overhead in terms of the state required at each node. We next describe several probabilistic protocols. In these protocols, when a sensor receives a packet, it chooses a random number and then decides whether to forward the packet based on the number chosen. We classify these protocols into two categories: biased and unbiased. In the unbiased protocol, all packets are forwarded with equal probability; in effect, this is gossiping [120]. In contrast, the biased protocol uses context information (the sensor's distance from the event) to moderate the estimate of the utility of the data: the closer the sensor is to an event, the more aggressively it disseminates information about it. Unbiased Protocol Probabilistic dissemination of data packets throughout a network has been studied previously [32, 93, 170, 217], but to the best of our knowledge, no studies explore its applicability to non-uniform information granularity requirements. Similar to the deterministic protocols, the unbiased protocol also takes a parameter that affects the accuracy of the forwarding. In this case, the parameter specifies the probability that a packet should be forwarded, and this value is the same for each incoming packet. The main advantage of this protocol is its simplicity and low overhead. As every packet is forwarded only with a certain probability, the protocol results in less communication than flooding (proportional to the forwarding probability). Also, the protocol does not require any state to be kept, giving it the potential to scale well. To adjust the accuracy of the information throughout the network, the forwarding probability can be tuned according to the application's needs. The primary tradeoff, however, is energy for accuracy.
In general, as the forwarding probability increases, the behavior converges toward flooding. While our current study considers only constant probabilities, our long-term goal is to explore probabilities that are adjusted dynamically to adapt to the current network traffic and the application's needs. In the general case, a node may decrease its forwarding probability if it senses high traffic in its neighborhood or low battery power, or increase its probability to improve dissemination. Biased Protocol In this protocol, the forwarding probability is chosen to be inversely proportional to the distance the packet has traveled since leaving the source sensor. In other words, if a sensor receives a packet from a close neighbor, it is more likely to forward it than a packet received from a neighbor much farther away. To estimate the distance between sensors, a sensor examines the TTL (time-to-live) field contained in the packet. If we assume all sensors use the same initial TTL, we can use the current TTL to adjust the forwarding probability for each packet. The following tuples indicate the forwarding probabilities used (second number in the tuple) when the packet has traveled the number of hops in the range specified in the first part of the tuple: <1-3, 0.8>, <4-6, 0.6>, <7-9, 0.4>, <10+, 0.2>. Estimating distance using TTL has negligible overhead. The computation is straightforward, and because a sensor must decrement the TTL field before forwarding the packet in any case, there is no additional overhead to extract the TTL information from the packet. Although TTL does not
Table 3.1: Non-uniform information dissemination study: simulation parameters.

    Simulation area        800 × 800 m²
    Transmission range     100 m
    Initial energy         10000 J
    MAC protocol           802.11
    Bandwidth              1 Mbps
    Transmit power         0.660 W
    Receive power          0.395 W
    Idle power             0.0 W
    Number of nodes        100
always indicate the exact distance between two sensors, we believe that, in general, TTL can be used as a rough approximation of physical distance, and is therefore a valid metric for biasing our forwarding approach. For example, consider a source node S and a destination node D. It is possible that, due to congestion or collisions, a packet gets dropped along the shortest path and another packet reaches node D via a longer route. In that case, the TTL would give a false estimate of distance. However, in a static network, node D can always maintain its current estimate of the TTL to node S. If it ever receives a packet from S with a higher TTL (meaning a shorter path), it can update its existing value. Maintaining such distance estimates will not work in networks with mobile nodes. In the case of mobile sensors, though, as the distance between S and D changes (decreasing, for example, as the nodes get closer), this is reflected in subsequent packets (a higher TTL value), and thus node D gets more and more accurate information about S. We use the TTL-based approach for the biased protocol mainly for its simplicity, resilience to mobility, and energy efficiency. Similar to the unbiased protocol, the biased protocol requires no additional storage overhead unless node distances are stored, and the protocol itself is completely stateless (note, however, that this does not eliminate the caching of recently seen packets to avoid re-broadcasting the same packet multiple times). Therefore, this protocol also scales well.
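The two probabilistic forwarding decisions can be sketched as follows. This is our own hedged illustration (function names are ours); the probability table mirrors the tuples given in the text, and hop distance is estimated as the initial TTL minus the packet's current TTL.

```python
import random

# Hypothetical sketch of the unbiased and biased forwarding decisions.
# Unbiased: every packet is forwarded with the same fixed probability p.
# Biased: the probability falls with hop distance, per the table in the text.

def unbiased_forward(p, rng=random):
    """Gossip-style decision: forward with fixed probability p."""
    return rng.random() < p

def biased_probability(hops_travelled):
    """Forwarding probability for a packet that has travelled this many hops."""
    if hops_travelled <= 3:
        return 0.8
    elif hops_travelled <= 6:
        return 0.6
    elif hops_travelled <= 9:
        return 0.4
    return 0.2                          # 10 or more hops

def biased_forward(initial_ttl, current_ttl, rng=random):
    hops = initial_ttl - current_ttl    # rough distance estimate from TTL
    return rng.random() < biased_probability(hops)
```

Note that both decisions need only the packet itself and a random draw: no per-source state, which is why these protocols are described as stateless.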
3.4 Experimental Study
In order to analyze the protocols, we use the ns-2 discrete event simulator [140]. Table 3.1 lists the relevant parameters used during our simulations. In the case of static networks, we consider two sensor deployment strategies: uniform and random. In a uniform deployment strategy, sensors are distributed in some regular geometric topology (e.g., a grid). With random deployment, sensors are scattered throughout the field with uniform probability. For a battlefield-like scenario, random deployment might be the only option, but for applications such as animal tracking in a forest, sensors may be deployed in a deliberate, uniform fashion. In order to simulate sensor readings, we divide the simulation into an initialization phase and a reporting phase. During the initialization phase, each sensor chooses a random number between 0 and 100 to serve as its initial sensor reading. During the reporting phase, each sensor increments its
reading by a fixed amount at fixed intervals. In the real world, due to correlation among physically co-located sensors, sensors will have a different reading pattern; however, this simulation does provide us with valuable information about the behavior of our protocols under various conditions. In the latter part of this section, we present a revised data model that tries to capture correlation among sensor readings. Our results indicate that the overall behavior of the protocols shows a very similar trend for both data models.
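The two-phase reading model just described can be sketched as follows. This is an illustrative Python sketch; the class name is hypothetical, and the fixed increment of 10 per one-second interval is the value reported for these simulations later in the chapter:

```python
import random

REPORT_INCREMENT = 10  # fixed increment per reporting interval (10 per second)

class SimulatedSensor:
    """Sensor reading model: random start in [0, 100], then linear growth."""

    def __init__(self):
        # Initialization phase: choose a random starting reading.
        self.reading = random.randint(0, 100)

    def report(self):
        # Reporting phase: increment by a fixed amount at each fixed interval.
        self.reading += REPORT_INCREMENT
        return self.reading
```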
3.4.1 Traffic Load Study

This study focuses on evaluating the effect of a change in traffic load for both grid and random topologies. In the first set of experiments, we study the effect of varying the traffic load systematically from 5 packets/sec to 1 packet/2 sec. The goal of these experiments is to understand the relationship between accuracy, reporting rate, and network capacity for both uniform and non-uniform dissemination scenarios. Note that in order to calculate accuracy, we find the difference between a sensor's local view of another sensor's data and the actual value of that sensor's data. A view is essentially the latest data that one sensor has from another sensor. This view is then normalized based on distance. Let R(S_{i,j}) denote sensor S_i's view of sensor S_j's data, and let n be the total number of sensors in the network. The weighted error e_i for a sensor S_i is given as:

e_i = \frac{1}{n} \sum_{j=1,\, j \neq i}^{n} \left| R(S_{i,j}) - R(S_{j,j}) \right| \cdot w_{ij} \qquad (3.1)

w_{ij} = \frac{1}{\left\lceil d(S_i, S_j) / \gamma \right\rceil} \qquad (3.2)
where d(S_i, S_j) is the Euclidean distance between sensors S_i and S_j, and γ was set to 100 (the transmission range of each node). The first equation shows that, for a given sensor, we calculate the weighted average error with respect to all other sensors in the network. We vary the weight in terms of distance with a step size of 100 meters. Note that Euclidean distance is used as the weighting factor so that the greater the distance, the smaller the contribution to the overall error. This error calculation captures our non-uniform data dissemination requirement by giving higher weight to errors for data that originated in a close neighborhood and lower weight to errors for data that originated at a distant sensor. It is worth noting that although we refer to this as error, because the value of the data at the source increases linearly, it also represents the accuracy of the data.

Our results indicate that with flooding, congestion is a severe problem, and that the other protocols are less prone to congestion. In this type of application, the effect of congestion is worse than that observed in traditional sensor networks [192]. From the simulation studies, we can see that flooding is the least energy-efficient protocol and has the highest error if the network is congested. RFiltercast and the biased protocol are more energy-efficient than flooding and provide low error in most cases. Filtercast and the unbiased protocol are the most energy-efficient protocols, but their accuracy is good (low error) only at higher sending frequencies.

Grid Topology

Figure 3.1 shows the performance of flooding, Filtercast, RFiltercast, and the biased and unbiased randomized protocols under various traffic loads for the grid topology. In these graphs, distance is varied across the X-axis (in steps of 100 meters) and the Y-axis shows mean unweighted error (mean absolute error). Note that, with non-uniform information dissemination, as the distance between the source node and sink node increases, loss in information precision is acceptable. From Figure 3.1(a), where the data rate is 5 packets/sec, we can see that even though flooding should theoretically have no error, due to congestion it has the highest error: when the total traffic exceeds the network capacity, congestion causes packets to be dropped, which gives rise to loss of information and high error. At the same time, high traffic results in more collisions. In this situation, even RFiltercast and the biased randomized protocol generate a high traffic load and thus they have high error as well. However, both Filtercast and the unbiased randomized protocol (with a forwarding probability of 0.5) perform well in this case because their traffic load does not exceed the available network capacity. As expected, for all protocols the error increases as the distance from the source increases, resulting in non-uniform information across the network. When the sending frequency is changed to 2 packets/sec, as shown in Figure 3.1(b), flooding still congests the network and thus has high error. However, for both RFiltercast and the biased protocol, the load no longer exceeds the network capacity and their performance is better than in the previous case. Also, note that these two protocols now perform better in terms of error rate than the unbiased protocol and Filtercast, because they disseminate more information without exceeding the network capacity. When the sending frequency is lowered to 1 packet/sec, as shown in Figure 3.1(c), then even flooding does not exceed the network capacity.
Since the network is no longer a bottleneck, flooding disseminates the maximum information successfully and clearly has the lowest error. Both the biased protocol and RFiltercast perform better than the unbiased protocol and Filtercast. The unbiased protocol and Filtercast have the highest error in this case because they do not disseminate as much information as the other protocols. The same trend continues even for the lowest sending frequency, shown in Figure 3.1(d). An interesting point about these results is the oscillatory behavior of the energy-error curves. To elaborate: if the total data exceeds the network capacity, then any further data on the channel will increase congestion and decrease the overall lifetime of the network. When the amount of data transmitted is below the network capacity, there is a trade-off between the energy spent and the accuracy observed: as long as the total data does not exceed the network capacity, sending more data will improve accuracy at the cost of energy spent in communication. However, with non-uniform information granularity, the accuracy between two sensors is proportional to the distance between them. RFiltercast and Filtercast try to achieve this by filtering packets, and the randomized protocols try to achieve this by probabilistically forwarding packets. Figure 3.2 shows the trade-off between energy and weighted error, using the weighted error calculation method described in Eqns. 3.1 and 3.2. In Figure 3.2, the X-axis indicates the energy spent in Joules and the Y-axis shows the mean weighted error. Each point represents one of the sending frequencies, ranging from 5 packets per second for the left-most point of each curve to one packet per two seconds for the right-most. For flooding, when the reporting frequency is highest, the energy spent is at its maximum. However, as mentioned earlier, congestion and collisions cause high error.
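For concreteness, the weighted-error metric of Eqns. 3.1 and 3.2 used throughout this study can be sketched as follows. This is an illustrative Python sketch; the array layout and function names are our own, not prescribed by our simulator:

```python
import math

GAMMA = 100.0  # transmission range of each node, used as the distance step

def weight(d_ij, gamma=GAMMA):
    """Eqn. 3.2: w_ij = 1 / ceil(d(S_i, S_j) / gamma)."""
    return 1.0 / math.ceil(d_ij / gamma)

def weighted_error(i, views, actual, dist):
    """Eqn. 3.1: weighted error of sensor i with respect to all other sensors.

    views[i][j] -- R(S_{i,j}): sensor i's latest view of sensor j's reading
    actual[j]   -- R(S_{j,j}): sensor j's actual reading
    dist[i][j]  -- Euclidean distance between sensors i and j
    """
    n = len(actual)
    return sum(abs(views[i][j] - actual[j]) * weight(dist[i][j])
               for j in range(n) if j != i) / n
```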
As the sending frequency decreases to the point where the total traffic no longer exceeds the network capacity, the error also decreases. Flooding performs best in terms of accuracy (minimum error) at a sending frequency of 1 packet/sec; below this rate, the error starts increasing because not enough information is propagated. This is an interesting phenomenon, where the error oscillates between two bounds. The upper bound is a function of the network capacity, whereas the lower bound is a function of the application-specific accuracy. Previous research has also shown this phenomenon [192].

Figure 3.1: Grid Topology: Mean absolute error as a function of distance for different source data rates. (a) 5 packets/sec; (b) 2 packets/sec; (c) 1 packet/sec; (d) 1 packet/2 sec.

Figure 3.2: Grid: Weighted energy-accuracy tradeoff.

Based on the energy-error trade-off, we can say that at high sending frequencies, flooding performs the worst, spending the most energy while not providing accurate information (high error). RFiltercast and the biased protocol start performing better than flooding at high rates. There is a considerable difference in both energy and error between RFiltercast and the biased protocol and flooding at a sending frequency of 2 packets/sec. As one can anticipate, flooding performs better than all other protocols in terms of accuracy when the sending frequency is 1 packet/sec, but note that there is not much difference among flooding, RFiltercast, and the biased protocol even when the network is operating in the non-congested mode. Filtercast and the unbiased protocol perform best in terms of energy and error at high sending frequencies, and their performance relative to the other protocols starts to degrade as the sending frequency is reduced. The desirable mode of operation for a protocol is the region where minimum energy is spent and low error is observed. Note that the desired mode of operation for a protocol depends on factors such as network density, the transmission range of the radios, etc.¹ From Figure 3.2, this zone lies between sending frequencies of 2 packets/sec and 1 packet/sec for RFiltercast and the biased protocol, whereas for flooding it lies at a sending frequency of 1 packet/sec. We want to point out that as the network size increases, flooding can pose severe problems in terms of scalability and energy efficiency. Therefore, randomized protocols should be considered as viable alternatives in these cases.
In our experiments we had a network of 100 sensors (a few hundred sensors for some simulations), but with a network of thousands of sensors we believe that the randomized protocols will perform much better than flooding. Also, the randomized protocols are more flexible, since one can set their parameters (turn the knobs appropriately) to balance application-specific accuracy with energy expenditure. Unfortunately, such optimizations are not possible with a naive protocol such as flooding. Among the randomized protocols, the biased protocol performs the best, spending moderate energy and achieving high accuracy. Our results show that randomized protocols can achieve high energy savings while at the same time achieving acceptable accuracy with almost no overhead. Also note that RFiltercast and the biased protocol have almost equivalent error curves, while the biased protocol has negligible overhead.
¹ In our future work, we will perform an analytical study to address this issue.
Figure 3.3: Random Topology: Mean absolute error as a function of distance for different source data rates. (a) 5 packets/sec; (b) 2 packets/sec.
Random Topology

Figure 3.3 shows our results with a random topology and the same traffic loads as before. The results for 1 and 2 packets/sec and the energy tradeoff study show expected results similar to those achieved for the grid topology, and are therefore not shown here. It is not clear whether regular deployment will offer advantages over uniformly distributed random deployment; if it does not, random deployment is preferable because of its low cost.

Transmission Range

Our next set of experiments shows the effect of an increase in the transmission range from the original 100 meters to 150 meters, while keeping the original 10x10 grid topology. An increase in transmission range corresponds to an increase in the degree (connectivity) of a sensor. This decreases the capacity of the network, meaning congestion occurs even at low sending frequencies. Intuitively, this will make the overall situation worse if the network is operating in a congested mode. This can be seen from our results, comparing Figures 3.1(a) and 3.4(a): there is an increase in overall error for flooding, RFiltercast, and the randomized protocols. In this case, even at the low sending frequency of 1 packet/sec shown in Figure 3.4(c), flooding does not perform well due to network congestion. Previously, when the transmission range was 100 meters, flooding performed well at this sending frequency (see Figure 3.1(c)). However, when the network is not congested, then due to the higher connectivity and shorter average hop length, the average error decreases. For example, for RFiltercast and the biased protocol at a sending frequency of 1 packet/sec, the maximum absolute error values with a transmission range of 100 meters are 0.4 and 0.6 respectively, as shown in Figure 3.1(c). With the transmission range changed to 150 meters, Figure 3.4(c) shows that the maximum absolute error for RFiltercast and the biased protocol changes to 0.24 for both.
Figure 3.4: Tx = 150 m (Grid Topology): Mean absolute error as a function of distance for different source data rates. (a) 5 packets/sec; (b) 2 packets/sec; (c) 1 packet/sec; (d) 1 packet/2 sec.
Similarly, all the protocols have low error values at a sending frequency of 1 packet/2 sec when the transmission range is 150 meters, as shown in Figure 3.4(d), compared to the simulations with a 100-meter transmission range, shown in Figure 3.1(d). Up to this point our study has considered static networks. In the next subsection we analyze the protocols for non-uniform information dissemination in the presence of mobility, along with a revised data model.
3.4.2 Mobility Study

To motivate the case for mobile sensors, consider a battlefield scenario where soldiers and armed vehicles move around carrying tiny sensors with them. Each sensor collects information about air contaminants so as to detect potential biological/chemical attacks. In this case, as a soldier moves around, the contaminant level sensed by the sensor changes depending upon the sensor's current location in the battlefield. Also, for a small change in location, there is not a very large change in the contaminant level reported. We can think of this as a spatio-temporal process, where there is both spatial and temporal correlation among the readings reported by the sensors. This means that the correlation among the sensor readings is a function of the distance between them; the closer the sensors are, the higher the correlation between their data. In order to model this application, we divide the simulation area of 800x800 meters into 16 squares, each called a zone. We assume that the sensor readings (contaminant levels in this case) follow a normal distribution in space (the inter-zonal distribution). For the inter-zonal distribution, we set the mean to 20 and the standard deviation to 2. This corresponds to a loose correlation among the data sensed by all the sensors in the battlefield. Also, there will be very high correlation among the data reported by sensors within the same zone. We modeled the correlation among sensors within a zone (intra-zonal) to follow a normal distribution as well, but with low variance compared to that of the inter-zonal distribution. For the intra-zonal distribution we set the standard deviation to be a random number in the interval from 0 to 0.5 (both inclusive). Also, we vary the mean of the intra-zonal distribution as the simulation progresses to reflect the temporal variations.
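Under these assumptions, drawing a reading from the zone model can be sketched as follows. This is an illustrative Python sketch; the zone indexing scheme and function names are our own:

```python
import random

AREA = 800.0        # 800 x 800 m simulation field
ZONES_PER_SIDE = 4  # 16 square zones in total
INTER_MEAN, INTER_STD = 20.0, 2.0  # inter-zonal normal distribution

def make_zone_means(rng=random):
    """Draw one base contaminant level per zone (inter-zonal distribution)."""
    return [rng.gauss(INTER_MEAN, INTER_STD)
            for _ in range(ZONES_PER_SIDE * ZONES_PER_SIDE)]

def zone_of(x, y):
    """Map a sensor position to its zone index."""
    side = AREA / ZONES_PER_SIDE
    return int(y // side) * ZONES_PER_SIDE + int(x // side)

def reading(x, y, zone_means, rng=random):
    """Intra-zonal reading: a tight normal around the zone's current mean."""
    sigma = rng.uniform(0.0, 0.5)  # intra-zonal std drawn from [0, 0.5]
    return rng.gauss(zone_means[zone_of(x, y)], sigma)
```

The time-varying intra-zonal mean (rising, then falling) would be modeled by shifting zone_means as the simulation clock advances.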
During the initial half of the simulation, the mean slowly increases, and during the latter half it decreases gradually. For evaluating the performance of these protocols, we calculate the weighted error in the same way that we did for the static networks (i.e., Eqns. 3.1 and 3.2). For static networks, the distance between every pair of sensors is fixed (time invariant). However, in the case of mobile sensors, as the sensors move around, the distance between a pair of sensors changes. By non-uniform information granularity, we intuitively mean that a sensor should have very precise information about its local neighborhood and that the loss in accuracy should be proportional to the distance between the source and sink sensors. However, with mobile sensors, as the sensors move, the local neighborhood of a sensor changes as a function of time. It is thus interesting to study how the protocols (both deterministic and randomized) react to these neighborhood changes. We now analyze the performance of all the protocols when nodes are mobile under the revised data model. For mobility we consider the following two cases. In the first case we set the maximum speed of the sensors to 2 m/s (Figure 3.5) and in the second case to 10 m/s (Figure 3.6). The former represents walking speeds (e.g., soldiers) while the latter represents vehicle speeds (e.g., tanks). The results presented here are the average of runs over 3 random topologies. The error calculation is done based on the distance between two nodes at the time of reading. Note that for the static network simulations we previously discussed, nodes chose a random number between 0 and 100 as their initial value and then, during the reporting phase, each sensor incremented its reading by a fixed amount (10 each second) at fixed intervals. In the revised data model, the variations in the sensor readings are not as high. Thus, the results with the revised data model and the initial data model are not numerically comparable per se. However, one can clearly see the same trend in the relative performance of the different protocols.²

Figure 3.5: Mobile sensors (speed 2 m/sec): Mean absolute error as a function of distance for different source data rates. (a) 5 packets/sec; (b) 2 packets/sec; (c) 1 packet/sec; (d) 1 packet/2 sec.

Figure 3.6: Mobile sensors (speed 10 m/sec): Mean absolute error as a function of distance for different source data rates. (a) 5 packets/sec; (b) 2 packets/sec; (c) 1 packet/sec; (d) 1 packet/2 sec.

Figure 3.7: Mobile sensors: Weighted error-energy tradeoff. (a) speed 2 m/sec; (b) speed 10 m/sec.

Figures 3.5(a) and 3.5(b) and Figures 3.6(a)-3.6(c) show that RFiltercast and both randomized protocols perform better than or very close to flooding, while the error value for Filtercast is high. Similar to its performance in static networks, at low data rates flooding starts performing better in terms of accuracy than all the other protocols. Figure 3.7 shows the trade-off between energy and weighted error, using the weighted error calculation method described in Eqns. 3.1 and 3.2, for both mobility cases. In these figures, the X-axis indicates the energy spent in Joules and the Y-axis shows the mean weighted error. The trends observed are very similar to those of the static network (Figure 3.2). At high data rates, flooding spends the maximum energy and has the highest error. Note that for higher mobility (Figure 3.7(b)), the performance of flooding is worse than in the low mobility case (Figure 3.7(a)). In the case of high mobility, the performance of RFiltercast and the biased randomized protocol is better than flooding except at the lowest data rate. These results indicate that the protocols are resilient to mobility, as changes in speed have almost no impact on the error values of the protocols for a more realistic data model.
3.5 Discussion
Overall, from these results we can conclude the following: for applications that can exploit non-uniform information, protocols can be designed to make efficient use of the available bandwidth while providing the necessary level of accuracy. Generally, RFiltercast outperforms Filtercast when the network is not congested. Also, naive randomized protocols such as the unbiased protocol outperform specialized protocols such as Filtercast. This is because, with the parameters we selected, these protocols generally forward messages more aggressively. In our setting, since accuracy is a function of distance, the errors for far sensors count less, and thus overall these protocols perform well. The biased randomized protocol has performance comparable to that of RFiltercast. In the simulations presented here, the biased protocol performs better in terms of accuracy than the unbiased protocol, even for distant sink nodes. This effect is simply due to the forwarding probability settings and, most importantly, the size of the simulated networks, which limits the maximum number of hops between the source and destination. As mentioned previously, in the biased protocol sensors transmit packets from nearby sensors with high probability, and this probability decreases linearly as a function of the number of hops between the source and the sensor transmitting the packet (see Section 3.3.3). In the unbiased protocol, however, the forwarding probability for any packet that needs to be forwarded is constant (0.5 in our simulations). In our simulations, the maximum number of hops is limited to six, since we could not run larger simulations due to computational resource constraints. Therefore, for the simulations presented here, the biased protocol has a higher forwarding probability than the unbiased protocol for the first few hops (with respect to the source), which dominates the picture. We conjecture that if we increased the number of hops (to, say, 20), then for distant sinks the unbiased protocol would perform better than the biased protocol in terms of accuracy.

² We conjecture that the change in the shape of the error curves in the case of mobile sensors can be attributed to the revised data model.
Note that both Filtercast and RFiltercast have some overhead to maintain the source lists and the count of how many packets a given source node has transmitted. The randomized protocols, on the other hand, do not require such state to be maintained. We believe that randomized protocols with intelligent adjustment of the forwarding probabilities can be considered the most efficient alternative for non-uniform data dissemination.
3.6 Related Work
Recently, sensor networks have drawn a considerable amount of attention from the research community, and several groups have proposed architectures for sensor nodes [29, 43, 63]. Kahn et al. [104] described the research challenges posed by "smart dust" technology for mobile networking and other related systems areas. This paper initiated much research in the field by spurring the development of new protocols and the exploration of other systems-level issues in sensor networks. Networking and data dissemination issues have also received considerable interest. The data-centric nature of sensor networks has allowed researchers to explore alternatives to traditional protocols [85, 95]. Most existing work focuses on two primary sensor network information dissemination models: (1) sensors send their data toward a central base station that has infinite power and is responsible for all data processing, with no in-network processing; or (2) sensors do some in-network data processing, such as data fusion, and this high-level data is sent to the central base station. A number of such approaches have been proposed (e.g., [87, 97, 122]). However, in our case, we do not assume the presence of any such base station, and the sensors disseminate information among themselves so that the user can connect to any of the sensors to extract network information. Other studies have considered specific sensor network applications and their implications for protocol
design. Cerpa et al. [42] have considered habitat monitoring and have designed protocols to match the application's needs. Heinzelman et al. [88] described adaptive protocols for information dissemination. In that work, to save energy, sensors send out advertisements for the data they have, and they only send the actual data if it is requested by one or more nodes. In previous work [191], we described probabilistic flooding alternatives; however, the main goal of that work was congestion avoidance rather than non-uniform information dissemination. Li et al. [120] proposed a gossip-based approach for routing protocols to reduce routing overhead. However, that study focused only on routing messages (with an implicit uniform information granularity requirement). Recently, Barette et al. [32] proposed a family of gossip-based routing protocols for sensor networks. In their study they considered various parameters such as the number of hops between the source and the destination, the number of hops the packet has traveled, etc. In the DREAM [170] routing protocol, routing tables are updated based on the distance between two nodes and the mobility rate of a given node. While this work has a similar flavor to ours, exploiting non-uniform information needs, it is limited to adjusting routing tables and does not apply to the actual data that is exchanged between two nodes. Kempe et al. [52] presented theoretical results for gossiping protocols, with resource location as a motivating problem and delay as the primary consideration. In their setting, a node at distance d from the origin of a new information source should learn about it with a delay that grows slowly with d and is independent of the network size. They do not consider application-level performance criteria such as accuracy, which is part of our study. In this chapter we have considered flooding as one of the alternatives for data dissemination in sensor networks.
However, flooding and its alternatives have also been explored in the context of mobile ad hoc networks. Perkins et al. describe IP flooding in ad hoc networks [93]. While this chapter considers probabilistic flooding protocols for sensor networks, Sasson et al. have studied probabilistic flooding for ad hoc networks [217], using the phase transition phenomenon as a basis for selecting the broadcasting probability. Williams et al. [210] described and compared several broadcasting protocols (including probabilistic protocols) in the context of mobile ad hoc networks. Note that the forwarding probability in these works is not sensitive to the context of the data (its distance from the source); this is the major difference between these protocols and ours. In summary, the primary difference between our work and existing work is the use of application knowledge as context to moderate the forwarding of data and to achieve information dissemination with non-uniform granularity. In our study, we focus on a new application requirement, non-uniform information dissemination, and we analyze protocols for this class of applications.
3.7 Conclusions and Future Work
In this chapter we considered sensor network applications where events need to be disseminated to observers that may be present anywhere in the sensor field. For such applications, simply flooding all the data is extremely wasteful. Therefore, we defined the idea of non-uniform information dissemination to capitalize on the fact that the value of data is typically highest for observers that are closest to the source of the data. We developed and analyzed several protocols to accomplish non-uniform dissemination, both deterministic (Filtercast and RFiltercast) and non-deterministic (unbiased and biased), and we evaluated them under various traffic loads and transmission ranges. In all cases, the developed protocols were clearly superior to simple flooding, both from an application and a network perspective. With flooding, congestion appears to be a limiting constraint and, further, flooding is not generally energy-efficient. Our results indicate that the performance of RFiltercast and the biased randomized protocol is almost equivalent. RFiltercast requires each sensor to maintain some extra state information, whereas the biased randomized protocol is completely stateless and has negligible overhead. Similarly, the performance of Filtercast and the unbiased randomized protocol is almost equivalent. We also showed that both RFiltercast and the randomized protocols are resilient to mobility. While in this chapter the distance between two nodes is used as the context parameter for non-uniform data dissemination, in our future work we will focus on a broad range of applications with a non-uniform information dissemination requirement, where factors other than distance, such as the importance of the information and the confidence in the generated data, can be used. We would also like to develop protocols that tune the forwarding probabilities dynamically depending upon factors such as traffic load, network connectivity, and remaining resources (e.g., battery power), within the holistic framework as a balance between the utility of forwarding the data and the cost of doing so. Finally, we would like to develop a priority-based protocol in which a source marks all its outgoing packets with a certain priority to indicate the importance of the information contained in a given packet; any forwarding node can then consider the priority of the packet when making its forwarding decision. These techniques will extend the applicability of non-uniform information dissemination to new classes of applications for wireless sensor networks.
Chapter 4 Collaborative Storage Management
4.1 Introduction
In this study, we consider a class of sensor networks where the information collected by the sensors is not retrieved in real time. In such applications, the data must be stored, at least temporarily, within the network and used in response to dynamic queries, until it is later collected by an observer or until it ceases to be useful. For example, data may be collected for a scientific application in a sensor field, with scientists appearing occasionally (e.g., every week) to check on the experiment and collect the data. Furthermore, in some applications the sensors collect data that may be needed by users of the network, who generate queries dynamically. In such applications, the data must be stored in the network: storage is a primary resource which, in addition to energy, determines the useful lifetime of the network. This is the problem considered in this research: how to use the limited persistent storage of a sensor to store sampled data effectively. One basic storage management approach is to buffer the data locally at the sensors that collect them. However, such an approach does not allow neighbors to collaborate to reduce the size of their overall data. Two approaches for reducing the data size are possible if neighbors collaborate: (1) Aggregation: neighbors can capitalize on the spatial correlation of data among nearby sensors to reduce the overall size of the stored data. Aggregation is well known in the context of traditional real-time monitoring sensor networks [34, 96]. However, there are important differences in this case: traditional aggregation relies on the presence of data from multiple sensors at intermediate nodes as the data is sent in real time towards a sink. With local buffering this is not possible without forcing data exchange among sensors. 
There are other important differences from traditional aggregation, such as the long-term availability of the data and the ability to revisit aggregation decisions since storage is reassignable; and (2) Coordination for redundancy control: with local buffering, the sensors may be collecting and storing redundant data. By coordinating infrequently, redundancy can be estimated, allowing the sensors to sample less aggressively and save storage and energy. Collaborative storage management can provide the following advantages over a simple buffering technique: (1) More efficient storage allows the network to continue storing data for a longer time without exhausting storage space; (2) Load balancing is possible: if the rate of data generation is not uniform at the sensors (e.g., in the case where a localized event causes neighboring sensors to collect data more aggressively), some sensors may run out of storage space while space remains available at others. In such a case, it is important for the sensors to collaborate to achieve load 
balancing for storage to avoid or delay data loss due to insufficient local storage; and (3) Coordination among nearby sensors allows tracking of context related to the redundancy in the reported data. This allows dynamic, localized reconfiguration of the network (such as adjusting the sampling frequencies of sensors based on estimated data redundancy and current resources). We describe a cluster-based collaborative storage approach and compare it through simulations to a local buffering technique. Our experiments show that collaborative storage makes more efficient use of sensor storage and provides load balancing, especially if a high level of spatial correlation/redundancy among the data of neighboring sensors is present. The trade-off is that with collaborative storage, data needs to be communicated among neighboring nodes, and thus collaborative storage expends more energy than local buffering in the data collection phase. However, since data is aggregated under collaborative storage, a smaller amount of data is stored and a smaller amount of data is eventually relayed to the observer, thereby reducing energy dissipation in this phase of operation. In addition to collaborative storage, we explore the use of context for optimizing the data storage operation. Specifically, we measure the available redundancy and feed it back to the sensors to enable them to adjust the sampling rate to match the required application fidelity. Since an individual sensor has a limited view, its estimate of its data's utility might diverge significantly from the actual utility. We show how context can be used to bridge that gap, either by providing feedback from the cluster head to its members or by exchanging information directly among neighboring sensors. The remainder of this chapter is organized as follows. Section 4.2 overviews the partitioned sensor network problem and motivates collaborative storage in more detail in the context of this problem. 
Section 4.3 discusses the design goals of a storage management scheme and the general energy-storage tradeoff involved in collaborative storage. Section 4.4 provides an overview of related work in this area. Section 4.5 presents the proposed storage management protocols and discusses the important design tradeoffs. In Section 4.6 we evaluate the storage alternatives under different scenarios. Finally, Section 4.7 presents conclusions and our future research.
4.2 Motivation
In this section, we motivate the storage problem by describing applications that require in-network storage. We identify the following two classes of applications: 1. Offline Monitoring: the sensors are deployed to collect detailed information about a phenomenon for later playback and analysis. Eventual data collection (reach-back) can be accomplished by an observer who moves around the sensor field, or the data can be relayed back through multi-hop communication among the sensors themselves. In either case, the data is stored continuously, but only read once. ZebraNet [103] is an example of off-line monitoring: it is a sensor network for wild-life tracking whose goal is to monitor migration patterns, social structures and mobility models of various animal species. In this application, sensors are attached to animals. Scientists (the observers) collect the data by driving around the monitored habitat, receiving information from zebras as they come into range. Data collection is not preplanned: it may be unpredictable and infrequent. The sensors do not have an estimate of the observer's schedule. The observer would like the network to maintain all the new data samples available 
since the last time the data was collected. Further, we would like the collection time to be small, since the observer may not be in range of a given zebra for long. 2. Dynamic Augmented Reality: in this type of application, the sensors store data about ongoing events. The data is dynamically queried by users within the network. The queried data can be data about current and recent events or even historical data. For example, such a network deployed in a battlefield may be queried by soldiers to learn about nearby enemy units (current and recent data) or by commanders to learn about long-term enemy movement. In such a network, collected data may be accessed multiple times (or not at all). Moreover, the importance of a piece of data may change with time. Storage may also be used to tolerate temporary network partitioning, where the observer is not reachable from the partitioned sensors, without losing potentially valuable data. The Remote Ecological Micro-Sensor Network [161] is aimed at remote visual surveillance of federally listed rare and endangered plants. This project aims to provide near-real-time monitoring of important events such as visitation by pollinators and consumption by herbivores, along with monitoring a number of weather conditions and events. Sensors are placed in different habitats, ranging from scattered low shrubs to dense tropical forests. Environmental conditions can be severe; e.g., some locations frequently freeze. In this application, network partitioning (relay nodes becoming unavailable) may occur due to the extreme physical conditions (e.g., deep freeze). Moreover, voluntary partitioning may occur if relay nodes operate at a low duty cycle to conserve energy or to reduce interference with the observed phenomenon. Important events that occur during disconnection periods should be recorded and reported once the connection is reestablished. Effective storage management is needed to maximize the partitioning time that can be tolerated. 
In the next section, we discuss various factors that affect the design of a storage management scheme and then describe the design goals of such a system.
4.3 Energy-Storage Tradeoff Space
A majority of sensor network research focuses on making sensor network design and operation energy efficient. However, for storage-bound applications, there is an additional finite resource: the available storage at the sensors. Once the available storage space is exhausted, a sensor can no longer collect and store data locally; unless it deletes older data, a sensor that has run out of storage space ceases to be useful. Thus, the sensor network's utility is bound by two resources: its available energy and its available storage space. Effective storage management protocols must balance these two resources to prolong the network's useful lifetime. Energy and storage are fundamentally different resources. Specifically, storage is a reassignable resource, while energy is not. For example, a node may free up some storage space by deleting or compressing data. This is not possible in the case of energy, as the battery power spent in either transmitting or receiving data cannot be reassigned to new data. Additionally, storage at other nodes may be utilized, at the cost of transmitting the data to them. The alternative to storing the data locally is transmitting the data towards an observer or a collection point. With existing technology, storage devices consume significantly less energy than RF communication devices. Accordingly, the tradeoff between storage and energy is complex. Sensors may exchange their data with nearby sensors to take advantage of the spatial correlation in their data and reduce the overall data size. Another positive side effect is that the storage load can be balanced even if the data generation rates or the storage resources are not. However, the exchange of data 
among the sensors consumes more energy in the data collection phase: the energy cost of communicating the data far outweighs the energy savings in storage (due to the smaller data size). On the surface, it may appear that locally storing data is the most energy-efficient solution. However, the extra energy spent in exchanging data may be counterbalanced by the energy saved by storing smaller amounts of data and, more importantly, by the smaller energy expenditure when replying to queries or relaying the data back to observers. In Section 2.10 we described canonical design goals for sensor network protocols. In this section, we propose the following additional design goals for a sensor storage management protocol.
• Storage efficiency: Since the amount of storage available to a sensor is very limited, it is important to minimize the data that needs to be stored. Efficient use of storage space leads to better coverage, since a given sensor can continue to store data for a longer time period.
• Storage load balancing: If the available storage resources or the data generation rates are non-uniform, it is desirable that the storage be load balanced to avoid exhausting the storage at important sensors and losing their data.
• Data coverage: This is a measure of the fraction of the data samples that the network was able to collect and retain.
• Energy efficiency: Sensors are constrained by the limited battery power available to them. Any storage management scheme should be designed with the goal of energy efficiency in mind. Energy is spent in two phases: (1) during data collection, the energy spent to store the data in addition to any communication that occurs to take advantage of collaborative storage; and (2) during data access, the energy spent in relaying the data to one or more data sinks. In the offline monitoring model, data access occurs once. In the augmented reality model, the data may be accessed any number of times.
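Two of the design goals above, data coverage and storage load balancing, lend themselves to simple quantitative measures. The following sketch shows one hypothetical way to compute them; the metric definitions are ours for illustration and are not taken verbatim from the evaluation in this chapter.

```python
def data_coverage(samples_retained, samples_generated):
    """Data coverage: the fraction of generated samples the network was
    able to collect and retain; 1.0 means no samples were lost."""
    if samples_generated == 0:
        return 1.0
    return samples_retained / samples_generated

def storage_imbalance(per_node_usage):
    """Crude load-balancing indicator: the ratio of the maximum to the
    mean per-node storage usage. Values near 1.0 indicate balanced load;
    large values mean some sensors will exhaust storage early."""
    mean = sum(per_node_usage) / len(per_node_usage)
    return max(per_node_usage) / mean if mean > 0 else 1.0
```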
4.4 Related Work
Because of the wireless nature of sensors, the primary resource constraint is the limited battery energy available. Energy-awareness permeates all aspects of sensor design and operation, from the physical design of the sensor [29, 43] to the design of its operating system [91], communication protocols and applications [214]. In this section, we briefly overview some of the issues involved in storage management; for a more detailed review of these issues, please refer to our survey on this topic [194]. Imielinski and Goel propose DataSpaces: a model of sensor networks where sensors permeate the physical world collecting and storing data locally and queries to this data are geographically based [95]. DataSpaces can benefit from collaborative storage to reduce the size of the stored data and to load balance storage without migrating data significantly from its source. Ratnasamy et al. propose using Data Centric Storage (DCS) to store data by name within a sensor network such that all related data is stored at the same (or nearby) sensor nodes using geographic hashing [156]. GHT is a structured approach to sensor network storage that makes it possible to index data based on content without requiring query flooding. GHT also provides load-balancing of storage usage (assuming fairly uniform sensor deployment). GHT implements a Distributed Hash Table by hashing a key k into geographic coordinates. Thus, queries for data 
of a certain type are likely to be satisfied by a small number of nodes, significantly improving the performance of queries. However, this enhanced query performance requires moving related data from its point of generation to its appropriate keeper as determined by geographic hashing. We view this work as higher-level management of data focusing on optimizing queries rather than storage: our approach could complement DCS by providing more effective storage of the data as it is collected. GHT targets retrieval of high-level, precisely defined events. The original GHT implementation is limited to reporting whether a specific high-level event occurred; it is not able to efficiently locate data in response to more complex queries. The Distributed Index for Features in Sensor Networks (DIFS) attempts to efficiently support range queries, i.e., queries where only events within a certain range are desired. In DIFS, the authors propose a distributed index that provides low average search and storage communication requirements while balancing the load across the participating nodes [78]. Concurrently with our work [195], Ganesan et al. have explored protocols for storage-constrained sensor networks [76]. They consider a problem similar to ours and explore some of the solution space we are considering, with some important differences. More specifically, our work differs in the following ways: (1) We explore additional approaches to storage management, including those using coordination for redundancy control; (2) we explore issues that arise due to uneven data generation (e.g., due to event-driven or adaptive sampling applications) and non-uniform storage distribution (e.g., due to non-uniform deployment of the sensors), in which case effective load balancing is required; and (3) we study some additional characteristics of the storage protocols, including coverage and collection time and energy. Conversely, Ganesan et al. 
consider some aspects of the problem that we do not examine in detail. For example, in storage-constrained networks, one of the issues is how the algorithm behaves when storage becomes the limiting resource. The proposed approach is to use multi-resolution storage: adaptively reducing the resolution of the stored data based on its importance. They explore a policy for multi-resolution storage based on the age of the data, proposing and evaluating several novel aging strategies (reduction of resolution based on age).
4.5 Storage Management Protocols
A primary objective of storage management protocols is to efficiently utilize the available storage space so as to continue collecting data for the longest possible time, without losing samples, in an energy-efficient way.
4.5.1 Storage Approaches
Storage management approaches can be classified as: 1. Local storage: This is the simplest solution, where every sensor stores its data locally. This protocol is energy efficient during the storage phase since it requires no data communication. Even though the storage energy is high (due to all the data being stored), the current state of technology is such that storage costs less than communication. However, this protocol is storage inefficient since the data is not aggregated and redundant data is stored among neighboring nodes. Local storage is unable to load balance if data generation or the available storage varies across sensors. 
2. Collaborative storage: Collaborative storage refers to any approach where nodes collaborate. This includes cooperation to estimate local redundancy as well as the exchange of data for aggregation and load balancing. Collaboration leads to two benefits: (1) Less data is stored: measurements obtained from nearby sensors are typically correlated, which allows data samples from neighboring sensors to be aggregated; and (2) Load balancing: collaboration among sensors allows them to load balance storage. It is important to consider the energy implications of collaborative storage relative to local storage. Collaborative storage requires sensors to exchange data, causing them to expend energy during the storage phase. However, because they are able to aggregate data, the energy expended in writing this data to a storage device is reduced. In addition, once connectivity with the observer is established, less energy is needed during the collection stage to relay the stored data to the observer. We note that this holds true even if in-network aggregation is carried out for locally buffered data during the reach-back stage, for the following two reasons: (1) the initial communication (first hop) of the locally buffered data will not be aggregated; and (2) aggregation is less efficient: a smaller amount of time and resources is available when near-real-time data aggregation is applied during reach-back as compared to aggregation during the storage phase, because all the data collected during the storage phase must be compressed in a short time. In the remainder of this section, we first discuss the use of collaboration for data aggregation and then for redundancy control.
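The energy argument above can be made concrete with a back-of-envelope model. The cost constants and the per-hop accounting below are hypothetical; they only illustrate why collaborative storage can win when the reach-back path is long and aggregation (ratio `alpha`) is effective.

```python
def local_storage_energy(data_bytes, e_store, e_tx, hops_to_observer):
    """Local storage: all data is written locally, then relayed
    unaggregated towards the observer during reach-back."""
    return data_bytes * e_store + data_bytes * e_tx * hops_to_observer

def collaborative_storage_energy(data_bytes, alpha, e_store, e_tx,
                                 hops_to_observer):
    """Collaborative storage: one extra hop to the cluster head during
    collection, but only the aggregated fraction alpha is stored and
    later relayed to the observer."""
    exchange = data_bytes * e_tx           # one-hop exchange to the CH
    stored = alpha * data_bytes * e_store  # aggregated data written once
    relay = alpha * data_bytes * e_tx * hops_to_observer
    return exchange + stored + relay
```

With a radio roughly 30-40 times as expensive per byte as storage, collaborative storage loses during collection but can recover the cost over a multi-hop reach-back path.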
4.5.2 Collaborative Storage Protocols for Data Aggregation
One use of collaboration is to take advantage of data aggregation. Aggregation in the application domain we are considering differs from that in traditional sensor networks because of its non-real-time nature. More specifically, in traditional applications aggregation is carried out using a snapshot of the available data. Data cannot be delayed because it is assumed that an observer is interested in continuously monitoring the data. In the applications we are considering, the data is held locally for an extended period, allowing more effective “wide-angle” aggregation to be carried out. Aggregation is highly application dependent. Consider a tracking application where nearby sensors exchange a local estimate of the distance to a phenomenon. Once this information is available at a single node, it may be triangulated into a single location estimate. In another application, multiple samples from nearby sensors are beamformed to produce a single high-quality sample. In order to develop the general storage tradeoff, we abstract away the details of the aggregation model and consider only the resulting data size reduction. Primarily, we model aggregation as compression of the collected data samples and vary the compression ratio. This model is not representative of all applications (e.g., beamforming); we discuss the effect of alternative aggregation models later. A number of organizations are possible for collaborative storage. Data is most correlated and redundant among nearby sensors. Moreover, to minimize the data exchange cost, we restrict data exchange to be among neighboring sensors (recall that the cost of communication is extremely high compared to storage). Finally, data must be collected at a single location for aggregation. As a result of these three factors, a cluster-based model suggests itself. 
Clustering has been widely studied in sensor and ad hoc networks; the specific clustering algorithm used is not important – virtually any existing clustering algorithm can be used. In the remainder, we briefly describe the features of the Cluster Based Collaborative Storage (CBCS) protocol used in our evaluation study. CBCS uses collaboration to take advantage of data aggregation. 
In CBCS, clusters are formed in a distributed connectivity-based or geographically-based fashion. Each sensor sends its observations to the elected Cluster Head (CH) periodically. The CH then aggregates the observations and stores the aggregated data. Only the CH needs to store the aggregated data, thereby resulting in low storage consumption. The clusters are rotated periodically to balance the storage load and energy usage. Note that only the CH needs to keep its radio on during its tenure, while a cluster member can turn off its radio except when it has data to send. This results in high energy efficiency: idle power consumes significant energy in the long run if radios are kept on, and the reception of unneeded packets while the radio is on also consumes energy. Operation during CBCS can be viewed as a continuous sequence of rounds until an observer/base station is present and the reach-back stage can begin. Each round consists of two phases: (1) CH election phase: Cluster election is the process of identifying cluster heads that serve as control points and data sinks for their cluster members. In this phase, each sensor advertises its resources to its one-hop neighbors. Based on this resource information, a cluster head (CH) is selected. The remaining nodes then attach themselves to that CH during the data transfer phase; and (2) Data exchange phase: If a node is connected to a CH, it sends its observations to the CH; otherwise, it stores its observations locally. The CH election approach used in CBCS is based on the characteristics of the sensor nodes, such as available storage, available energy, or proximity to the “expected” observer location. The criterion for CH selection can be arbitrarily complex; in our experiments we used available storage. We borrow the idea of cluster head rotation for load balancing from the LEACH protocol [86]. CH rotation is done by repeating the cluster election phase with every round. 
The frequency of cluster rotation influences the performance of the protocol. Depending on the cluster formation criteria, there is an overhead for cluster formation due to the exchange of messages. Also, note that there is a tradeoff in the cluster rotation frequency: we must balance the need for frequent cluster rotation, which achieves fine-grained load balancing of energy and storage resources, against the overhead of cluster formation. The cluster election approach above may result in a situation where a node A selects a neighbor B to be its CH while B itself selects C (which is out of range of A) to be its own CH. This may result in chains of cluster heads, leading to ineffective multi-hop clustering. To eliminate this problem and restrict clusters to one hop, geographical zoning is used, an idea similar to the approach of constructing virtual grids [212]. More specifically, the sensor field is divided into zones such that all nodes within a zone are in range of each other. Cluster selection is then localized to a zone, such that a node only considers cluster advertisements occurring in its zone. Only one CH is selected per zone, eliminating CH chaining as discussed above. We note that this approach requires either pre-configuration of the sensors or the presence of a location discovery mechanism (GPS cards or a distributed localization algorithm [37]). In sensor networks, localization is of fundamental importance, as the physical context of the reporting sensors must be known in order to interpret the data. We therefore argue that our assumption that sensors know their physical coordinates is realistic. In any case, we emphasize that cluster formation is orthogonal to collaborative storage and other cluster formation approaches can be used.
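The zone-restricted election described above can be sketched compactly. The data structures here (dicts mapping node ids to free storage and to zones) are assumptions for illustration; CBCS itself distributes this logic via advertisement messages among one-hop neighbors.

```python
def elect_cluster_heads(free_storage, zone_of):
    """Zone-restricted CH election sketch: within each zone (sized so
    that all members are in mutual range), the node advertising the most
    free storage becomes CH. Selecting exactly one CH per zone rules out
    the CH chains described above.
    free_storage: node id -> available storage; zone_of: node id -> zone."""
    by_zone = {}
    for node, storage in free_storage.items():
        by_zone.setdefault(zone_of[node], []).append((storage, node))
    # Pick the member with maximum available storage in each zone.
    return {zone: max(members)[1] for zone, members in by_zone.items()}
```

Rotation then amounts to re-running this election each round with updated storage advertisements.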
4.5.3 Coordination for Redundancy Control
One idea we explore is coordination among the sensors. Specifically, each sensor has a local view of the phenomenon, but cannot assess the importance of its information given that other sensors may report correlated information. For example, in an application where 3 sensors are sufficient to 
triangulate a phenomenon, 10 sensors may be in a position to do so and be storing this information locally or sending it to the cluster head for collaborative storage. Through coordination, the cluster head can inform the nodes about the degree of redundancy, allowing the sensors to take turns triangulating the phenomenon. In this case, a sensor's local estimate of its data's utility diverges from its actual utility from a global perspective. Since the cluster head has a broader view of the network, it can provide this context to its members in the form of feedback. Note that there is some overhead in building the context, namely the communication between the CH and individual sensors. At the same time, significant energy savings can be achieved if this context information is fed back to the sensors, because this feedback is likely to improve operation for an extended period of time. We believe that protocols that weigh the cost of providing context against the benefit that can be obtained from it would lead to more efficient sensor network operation. One way to make the feedback decision is a threshold-based mechanism. For example, a CH can determine the divergence between the local and global estimates and, if the divergence is greater than some threshold (say T), provide the feedback; otherwise it takes no action. Further details on how an application can make such a decision can be found in Chapter 6. Coordination can be carried out periodically at low frequency, with a small overhead (e.g., with CH election). Similar to CH election, the nodes exchange metadata describing their reporting behavior, and we assume that some application-specific estimate of redundancy is performed to adjust the sampling rate. Coordination can be used in conjunction with local storage or collaborative aggregated storage. 
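The threshold rule sketched above can be written down directly. The utility values and the threshold T are abstract quantities here; how an application actually computes them is deferred to Chapter 6.

```python
def should_send_feedback(local_utility, global_utility, threshold):
    """Threshold-based feedback decision at the CH: corrective context is
    sent to a member only when its local utility estimate diverges from
    the CH's global estimate by more than the threshold T, so the
    communication overhead is paid only when feedback is likely to help."""
    return abs(global_utility - local_utility) > threshold
```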
In Coordinated Local Storage (CLS), the sensors coordinate periodically and adjust their sampling schedules to reduce the overall redundancy, thus reducing the amount of data that will be stored. Note that the sensors continue to store their readings locally. Relative to Local Storage (LS), CLS results in smaller overall storage requirements and savings in the energy spent storing the data. This also results in a shorter and more energy-efficient data collection phase. Similarly, Coordinated Collaborative Storage (CCS) uses coordination to adjust the sampling rate locally. As in CBCS, the data is still sent to the cluster head, where aggregation is applied. However, as a result of coordination, a sensor can adapt its sampling frequency/data resolution to match the application requirements. In this case, the energy spent in sending the data to the cluster head is reduced because of the smaller size of the generated data, but the overall size of the stored data is not reduced. We evaluate CLS and CCS against their non-coordinated counterparts, LS and CBCS.
4.6 Experimental Evaluation
We simulated the proposed collaborative storage protocols using the NS-2 simulator [140]. We use a CSMA-based MAC layer protocol. A sensor field of 350 × 350 meters is used, with each sensor having a transmission range of 100 meters. We considered three levels of sensor density: 50, 100 and 150 sensors deployed randomly. We divide the field into 25 zones (each zone is 70 × 70 meters to ensure that any sensor in the zone is in range of any other sensor). The simulation time for each scenario was set to 500 seconds, and each point represents an average over five different topologies. Cluster rotation and coordination are performed every 100 seconds in the appropriate protocols. We assume sensors have a constant sampling rate (set to one sample per second). For the coordinated redundancy control protocols, we used a scenario where the available redundancy was 
[Figure: plot of mean storage space consumed (bytes, ×10⁵, from 0 to 5) against the number of sensors (50, 100, 150) for Local-Buffer, CLS, CBCS, and CCS.]
Figure 4.1: Storage space vs. Network Density
on average 30% of the data size; this is the percentage of the data that can be eliminated using coordination. We note that this reduction in the data size represents a portion of the reduction possible using aggregation: with aggregation, the full data is available at the cluster head and can be compressed at a higher efficiency. Several sensor nodes that are appearing on the market, including the Berkeley MICA nodes [131], have flash memories. Flash memories have excellent power dissipation properties and a small form factor. As a representative, we consider a SimpleTech flash memory USB card [179]; Table 2.1 shows its energy characteristics. In current wireless communication technologies (Radio Frequency based), the cost of communication is high compared to the cost of storage (Table 2.3 shows the energy characteristics of a representative radio). From these tables, we can see that representative radios following the Zigbee IEEE 802.15.4 standard consume roughly 30-40 times as much energy per unit data as the SimpleTech USB card above. Our energy models in the simulation are based on these two devices. Note that the possible data aggregation/compression, as well as the reduction due to redundancy control, are application as well as topology dependent. Consider a temperature sensing application. For this application, a given sensor can collect data from all its neighbors and then simply take the average and store a single value (or perhaps the minimum, mean and maximum values) as representative. However, if the sensors are sending video data, then such high spatial compression might not be possible. In this chapter, instead of considering a specific application, we assume a data aggregation model where the cluster head is able to compress the size of the data by an aggregation ratio α. By controlling α we can consider different applications with different levels of available spatial correlation. 
While this model is useful in exposing the tradeoff space for collaborative storage, it is not representative of all applications. In particular, the size of the aggregated data grows linearly with the number of available sensors, rather than as a function of the observed phenomenon, as it should under effective monitoring. We consider the implications of this model for collaborative storage and explore other possible models later in this section.
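The aggregation model above can be sketched in a few lines (the function name and the numeric sizes below are illustrative assumptions, not values from the thesis):

```python
def stored_size_per_round(n_sensors, sample_size, alpha):
    """Size of the aggregated data a cluster head stores per round.

    Under this model the aggregate grows linearly with the number of
    sensors: each of the N members contributes a sample of D bytes, and
    the cluster head compresses the total by the aggregation ratio alpha.
    """
    return alpha * n_sensors * sample_size

# With local storage, every sensor keeps its own D bytes per round, so
# the network-wide saving is a factor of alpha:
local = 100 * 64                                     # 100 sensors, 64-byte samples
collaborative = stored_size_per_round(100, 64, 0.5)  # alpha = 0.5
assert collaborative == 0.5 * local
```

Setting α = 1 recovers the no-compression case, while α close to 0 models highly correlated data such as the temperature example above.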
4.6.1 Storage and Energy Tradeoffs

Figure 4.1 shows the average storage used per sensor as a function of the number of sensors (50, 100 and 150) for the four storage management techniques: (1) local storage (LS); (2) Cluster-Based Collaborative Storage (CBCS); (3) Coordinated Local Storage (CLS); and (4) Coordinated Collaborative Storage (CCS). For CBCS, the aggregation ratio was set to 0.5. The storage space consumption is independent of density for LS and is greater than that of CBCS and CCS (roughly in proportion to the aggregation ratio). CLS's storage requirement falls between the two because it reduces the storage requirement using coordination (we assumed that coordination yields an improvement uniformly distributed between 20% and 40%). Note that after the data exchange, the storage requirements of CBCS and CCS are roughly the same, since aggregation at the cluster head can reduce the data to a minimum size regardless of whether coordination took place. Surprisingly, in the case of collaborative storage, storage space consumption decreases slightly as density increases. While this is counter-intuitive, it is due to the higher packet loss observed during the exchange phase at higher densities: as density increases, the probability of collisions increases. These losses arise from the contention-based, unreliable MAC-layer protocol used when a node transmits its data to the CH. The negligible difference in storage space consumption between CBCS and CCS is likewise an artifact of the slight difference in the number of collisions observed under the two protocols. A reliable protocol such as that in IEEE 802.11, or a reservation-based protocol such as the TDMA-based protocol employed by LEACH [86], can be used to reduce or eliminate losses due to collisions (at an increased communication cost).
Regardless of the effect of collisions, one can clearly see that collaborative storage achieves significant savings in storage space compared to the local storage protocols (in proportion to the aggregation ratio). Figure 4.2(a) shows the energy consumed by the protocols, in Joules, as a function of network density. The X-axis represents the protocols at different network densities: L and C stand for local buffering and CBCS respectively, and L-1, L-2, and L-3 represent the local buffering results for network sizes of 50, 100, and 150 sensors respectively. The energy bars are broken into two parts: pre-energy, the energy consumed during the storage phase, and post-energy, the energy consumed during data collection (the relaying of the data to the observer). The energy consumed during the storage phase is higher for collaborative storage because of the data communication among neighboring nodes (not present in local storage) and the overhead of cluster rotation. CCS spends less energy than CBCS due to the reduction in data size that results from coordination. However, CLS has a higher expenditure than LS, since it requires costly communication for coordination. This cost grows with the density of the network because our coordination implementation has each node broadcasting its update and receiving updates from all other nodes. For the storage and communication technologies used, the cost of communication dominates that of storage. As a result, the cost of the additional communication during collaborative storage might not be recovered by the reduced energy needed for storage, except at very high compression ratios. This tradeoff is a function of the ratio of communication cost to storage cost; if this ratio goes down in the future (for example, due to the use of infra-red communication or ultra-low-power RF radios), collaborative storage becomes more energy efficient compared to local storage.
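This tradeoff can be illustrated with a back-of-the-envelope sketch. The per-byte cost model below is an assumption made for illustration (energy counted per byte, radio costing r times flash storage, with r roughly 30-40 as in the ZigBee vs. SimpleTech comparison above); it is not a measurement from the thesis:

```python
def energy_local(D, r, e_s=1.0):
    """Local storage: store D bytes, then transmit D bytes one hop
    during collection. e_s is the storage energy per byte; the radio
    costs r * e_s per byte."""
    return D * e_s + r * D * e_s

def energy_collaborative(D, alpha, r, e_s=1.0):
    """Collaborative storage: transmit D bytes to the cluster head,
    store alpha * D aggregated bytes, then transmit alpha * D during
    collection."""
    return r * D * e_s + alpha * D * e_s + r * alpha * D * e_s

# Collaborative wins only when alpha * (r + 1) < 1, i.e. alpha < 1/(r+1).
# With r = 30 that means alpha below roughly 0.03 -- a very high
# compression ratio, consistent with the discussion above:
r = 30
assert energy_collaborative(64, 0.5, r) > energy_local(64, r)
assert energy_collaborative(64, 0.02, r) < energy_local(64, r)
```

Under this model, lowering r (cheaper radios) relaxes the break-even aggregation ratio, which is the direction of the tradeoff described in the text.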
Conversely, if the ratio goes up, collaborative storage becomes less efficient. The data collection model depends on the application and network organization; several models are in use for deployed sensor networks. We use a simple collection model where we account only for the cost of transferring the data one hop. This model is representative of an observer that moves around and gathers data from the sensors. Also, in cases where the local buffering approach carries
Figure 4.2: Energy Consumption and Collection Time Study. (a) Energy Consumption vs. Density; (b) Mean Collection Time vs. Density.
out aggregation at the first hop towards the observer, the size of the data becomes similar in the two approaches and the remainder of the collection cost is the same. However, this is optimistic in favor of local storage, because near real-time data aggregation will not in general achieve the same aggregation level during collection as is achieved during collaborative storage: collaborative storage can afford to wait for samples and compress them efficiently. Moreover, in collaborative storage the aggregation is done incrementally over time, requiring fewer resources than aggregation during collection, where large amounts of data are processed in a short time period. The collaborative storage approaches outperform the local storage ones according to this metric due to their smaller storage size; CLS outperforms LS for the same reason. Finally, we assumed a reach-back model (eventual data collection) where the data is read once at the end, consistent with offline monitoring applications. For dynamic applications, data may be accessed multiple times by different observers; in this case, the energy savings in the post phase will further favor collaborative storage, because the reduced data size benefits multiple queries. Figure 4.2(b) shows that with collaborative storage, the collection time is considerably lower than that of local buffering. In addition, CLS outperforms LS. Low collection time and energy are important parameters from a practical standpoint. Having explored the effect of coordination, the remainder of the chapter presents results only for the two uncoordinated protocols (LS and CBCS).
4.6.2 Storage Balancing Effect

In this study, we explore the load-balancing effect of collaborative storage. More specifically, the sensors are started with a limited storage space and the time until this space is exhausted is tracked. We consider an application where a subset of the sensors generates data at twice the rate of the others, for example, in response to higher observed activity close to some of the sensors. To model
Figure 4.3: Percentage of Storage Depleted Sensors vs. Time.
the data correlation, we assume that sensors within a zone have correlated data; therefore all the sensors within a zone report their readings with the same frequency. We randomly select zones with high activity; sensors within those zones report twice as often as sensors within low-activity zones. In Figure 4.3, the X-axis denotes time (in multiples of 100 seconds), whereas the Y-axis denotes the percentage of sensors that have no storage space left. Using LS, in the even data generation case, all sensors run out of storage space at the same time and all data collected after that is lost. In comparison, CBCS provides a longer time without running out of storage because of its more efficient use of storage. The uneven data generation case highlights the load-balancing capability of CBCS. Using LS, the sensors that generate data at a high rate exhaust their storage quickly; we observe two subsets of sensors exhausting their storage at two different times. In comparison, CBCS has a much longer mean sensor storage depletion time due to its load-balancing properties, with sensors exhausting their resources gradually, extending the network lifetime well beyond that of LS.
4.6.3 Effect of the Aggregation Model

Figure 4.4 shows storage space consumption as a function of the aggregation ratio. As expected, the amount of storage space consumed by the CBCS protocol is proportional to the aggregation ratio. Collaborative storage management will work well for applications with high spatio-temporal correlation. One limitation of the aggregation model we have used so far is that the required storage size under collaboration grows in direct proportion to the number of sensors in the cluster; that is, the storage consumed in a round is αN · D, where α is the aggregation ratio, N is the number of sensors, and D is the data sample size. Since the available storage (N · S, where S is the available storage per sensor) is also a function of the number of sensors, storage is consumed at a rate (αD/S) that is independent of the number of sensors present in the zone, assuming perfect load balancing. For most applications this will not be the case: the aggregated data necessary to describe the phenomenon in the zone does not grow strictly proportionately to the number of sensors, and we expect storage lifetime to be longer in dense areas than in sparse ones.
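The density-independence of the consumption rate αD/S, and its breakdown under a density-independent aggregate, can be checked with a small sketch (the function names and storage sizes are hypothetical):

```python
def rounds_until_full(n_sensors, per_sensor_storage, alpha, sample_size):
    """Rounds a zone sustains before its pooled storage is exhausted,
    assuming perfect load balancing across the cluster and an aggregate
    that grows as alpha * N * D per round."""
    total_storage = n_sensors * per_sensor_storage        # N * S
    consumed_per_round = alpha * n_sensors * sample_size  # alpha * N * D
    return total_storage / consumed_per_round             # = S / (alpha * D)

# Independent of N when the aggregate grows as alpha * N * D:
assert rounds_until_full(50, 4096, 0.5, 64) == rounds_until_full(150, 4096, 0.5, 64)

def rounds_avg(n_sensors, per_sensor_storage, sample_size):
    """Density-independent aggregate (one packet per round, e.g. an
    averaging function): denser zones last proportionally longer."""
    return n_sensors * per_sensor_storage / sample_size

assert rounds_avg(150, 4096, 64) == 3 * rounds_avg(50, 4096, 64)
```

This is exactly the contrast explored in the biased-deployment experiment of Figure 4.5.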
Figure 4.4: Storage Space vs. Aggregation Ratio.
To highlight the above effect, we consider the case of a biased deployment where sensors are deployed randomly but with non-uniform density. In addition to the aggregation model considered so far, we consider a case where the CH, upon receiving packets from its N members, needs to store only one packet: for example, when the aggregation function stores the average value of the N samples (e.g., an average temperature reading). Clearly, in this second case, the size of the aggregated data is independent of network density. We now study how applications with these different aggregation functions perform on top of a biased deployment. To model biased deployment, we consider 4 zones with 5, 4, 3, and 2 sensors respectively. In these simulations, the round time was set to 10 seconds (CH selection happens every 10 seconds). In Figure 4.5, the X-axis shows time (in multiples of 10 seconds), whereas the Y-axis shows the percentage of active sensors within a given zone. As described earlier, we considered 4 zones for this study, and each line in Figure 4.5 represents a particular zone; for example, line Z-5 stands for the zone with 5 sensors, Z-2 for the zone with 2 sensors, and so on. As shown in Figure 4.5(a), when the aggregation ratio is a constant (0.5), all the zones provide coverage for almost the same duration. However, in the second case, as shown in Figure 4.5(b), coverage duration is directly proportional to network density: the higher the density, the longer the coverage. Sensor network coverage from a storage management perspective thus depends on the event generation rate, the aggregation properties, and the available storage. If the aggregated data size is independent of the number of sensors (or grows slowly with it), the density of the zone correlates with the availability of storage resources. Thus, both the availability of storage resources and their consumption may vary within a sensor network.
This argues for load balancing across zones to provide long network lifetime and effective coverage; this is a topic of future research. A lesson learned from this study is that the sensor network designer should evaluate the cost and coverage tradeoff during deployment. In particular, if the aggregation ratio is constant, then the performance of the network (in terms of coverage) does not depend on its deployment, so the designer need not spend the cost and effort of deploying the sensors with uniform density; deploying more sensors in the zones with higher activity does not yield better coverage time. In the other case, however, the designer might decide to spend more money and effort to deploy more sensors in areas of higher activity to obtain longer coverage. In the future, we would like to
Figure 4.5: Biased Deployment vs. Coverage. (a) Aggregation Ratio = 0.5: Coverage; (b) Aggregation Ratio = 1/N: Coverage.
explore the complex interactions and tradeoffs between network topology and aggregation ratio.
4.7 Conclusion and Future Work
In this chapter, we described the problem of storage management in sensor networks where the data is not continuously reported in real time and must therefore be stored within the network. Collaborative storage is a promising approach for storage management because it enables the use of spatial data aggregation and redundancy control among neighboring sensors to compress the stored data and optimize storage use. Collaborative storage also allows load balancing of the storage space, maximizing the time before data loss due to insufficient memory. Collaborative storage results in a lower time to transfer the data to the observer during the reach-back stage and better coverage than a simple local buffering approach. While collaborative storage reduces the energy required for storage, it requires additional communication; using current technologies, it therefore requires more energy than local buffering. Network effectiveness is bound both by storage availability (to allow continued storage of collected data) and by energy. Thus, protocol designers must be careful to balance these constraints: if the network is energy constrained but has abundant storage, local storage is most efficient from an energy perspective. Alternatively, if the network is storage constrained, collaborative storage is most effective from a storage perspective. When the network is constrained by both, a combination of the two approaches would probably perform best. One idea we explore is coordination among the sensors. Specifically, each sensor has a local view of the phenomenon, but cannot assess the importance of its information given that other sensors may report correlated information. In this case a sensor's local estimate of its data utility diverges from its actual utility from a global perspective. Coordination can be used in conjunction with local storage or collaborative aggregated storage.
In Coordinated Local Storage (CLS), the sensors coordinate periodically and adjust their sampling schedules to reduce the overall redundancy, thus reducing the amount of data that will be stored. Similarly, Coordinated Collaborative Storage (CCS) uses coordination to adjust the sampling rate locally. As in CBCS, the data is still sent to the cluster head, where aggregation is applied. Since the cluster head has a broader view of the network, it can provide this context to its members in the form of feedback. For both CLS and CCS, we showed that coordination among nearby sensors allows tracking of context related to the redundancy in the reported data. This allows dynamic, localized reconfiguration of the network (such as adjusting the sampling frequencies of sensors based on estimated data redundancy and current resources). We did not examine the implications of collaborative storage on data indexing and retrieval strategies. Furthermore, we did not consider the effect on file system design: sensor network file systems often have simple sequential-append access interfaces that are suitable for data logging, but not necessarily for collaborative storage. These issues are among our areas of future research interest.
Chapter 5 Dynamic Localization Control for Mobile Sensor Networks

5.1 Introduction
Localization is the ability of a mobile device to find out its physical location. Location is fundamental for sensor networks because interpreting the data collected from the network is not possible unless the physical context of the reporting sensors is known. Furthermore, location is often needed for lower-level services such as routing [211], clustering [212], and data storage [156]. Existing research has focused on algorithms and infrastructure for localization: that is, how to enable autonomous sensors to discover their location. In contrast, this chapter considers localization for mobile sensors: when sensors are mobile, localization must be invoked periodically to enable the sensors to track their location. Specifically, we target the related problem of Energy-Efficient Location Tracking (LT) and the associated energy-accuracy tradeoffs. With mobility, nodes must repeatedly invoke localization to maintain an accurate estimate of their location. The more often a node localizes, the more accurate its location estimate; however, since localization has an energy cost, we would like to minimize the localization frequency. Thus, localization must be carried out just often enough to capture location within an acceptable error tolerance; this is the LT problem. We emphasize that location tracking is orthogonal to localization: we are concerned with the problem of when to localize, which is largely independent of the underlying localization mechanism. Thus, our work is not specific to the localization mechanism used. In this research, we propose two new classes of location tracking: (1) adaptive and (2) predictive. Adaptive localization dynamically adjusts the localization period based on the recently observed motion of the sensor. In the predictive approach, the sensors estimate the motion pattern and project their motion into the future.
If the prediction is accurate, which occurs when nodes are moving predictably, estimates of location may be generated without performing actual localization, further reducing the localization frequency and thereby saving energy. Using analysis and simulations, we show that the proposed algorithms can significantly improve the energy efficiency of LT without sacrificing accuracy (in fact, improving accuracy in most situations). Both adaptive and predictive localization use prior behavior to forecast when localization should be carried out. This may be thought of as using context that spans time: instead of
just relying on local instantaneous state, past data is used to predict when to localize next, reducing energy expenditure while preserving application requirements. The remainder of this chapter is organized as follows. Section 5.1.1 motivates the problem via real-world examples. Section 5.2 presents related work. In Section 5.3 we define the dynamic localization problem, and we present candidate protocols for addressing it in Section 5.4. Section 5.5 presents some analysis of the performance of the protocols under various conditions. In Section 5.6 we carry out an evaluation study of the protocols. Section 5.7 discusses the effect of unexpected changes of mobility on the protocols. In Section 5.8, we discuss the tradeoffs associated with on-demand and proactive localization mechanisms. Finally, in Section 5.9 we present some concluding remarks.
5.1.1 Motivation – Mobile Sensor Applications

We motivate the need for LT with the following real network example. ZebraNet [103] is a sensor network application for wildlife tracking, in which sensors are attached to zebras. As the zebras move, the sensors record various parameters, providing insight into the migration patterns and social structures of these species. In the proposed implementation, LT is accomplished by having sensors localize (using GPS) every three minutes. However, such a fixed sampling period cannot account effectively for the different mobility patterns that an animal follows: for example, a 3-minute localization period is overly aggressive for an animal that is asleep or grazing, but may be insufficient for an animal moving at high speed. Clearly, it is better to have self-configuring sensors that adapt their LT dynamically to the animal's behavior to provide accurate, energy-efficient localization. As another motivating application, consider a cellular phone company that is interested in determining coverage (signal quality) in a customer area. Future infrastructure deployment decisions (e.g., new base stations) are driven by the collected information. At present, a common way to collect such information is to have a person comb the area measuring signal strength at various locations. This method is uneconomical and time-consuming. One can instead imagine cell phones equipped with micro-sensors measuring signal strength; such sensors need to find out their coordinates to report the measured parameters, and all subscribers carrying such phones gather this information as they move around.
5.2 Related Work
Localization may be carried out in one of several ways. If the node is equipped with a Global Positioning System (GPS) card, it can determine its coordinates by receiving signals from a number of satellites; for example, sensors in ZebraNet [103] use a GPS-based localization technique. Differential GPS additionally requires that the node receive signals from nearby ground reference stations. GPS cards are often too expensive and/or power hungry for embedded micro-sensors, or even for low-end mobile devices such as PDAs. In addition, GPS does not work inside buildings, where the satellite signals cannot be received. Alternative localization approaches have been proposed to allow nodes to learn their location either from neighboring nodes or from reference beacons [38, 154]; in these approaches, the node has to communicate to/from beacons and/or neighboring nodes. In the remainder of this section, we give a brief overview of some of the state-of-the-art techniques which can be used for localization in static networks.
Figure 5.1: Mobile Sensor with Localization Points.
Bulusu et al. [38] studied signal-strength-based and connectivity-based localization techniques in outdoor environments. In the RADAR system [31], distance is estimated based on the received signal strength of an RF transmission. Cricket [154] uses concurrent RF and ultrasonic chirps and estimates distance from the relative delay between them (Time-Difference-Of-Arrival, or TDOA). Other time-based techniques include Time-of-Flight [207] between a reference point and the receiver node. Niculescu et al. [138] proposed using angle-of-arrival to estimate position. Recently, He et al. [83] classified existing localization techniques into two categories: range-based and range-free. In range-based techniques, distances (or angles) from a receiver to a number of reference points are computed using signal-strength or timing-based techniques, and the position of the receiver is then computed using a multilateration technique [207]. Range-free techniques do not depend on the presence of any such information; He et al. also proposed range-free techniques for localization. Perhaps closest to our work, the pervasive computing community has investigated location and activity monitoring and prediction using wearable sensors [115]. However, the focus there is on the accuracy of the estimate and prediction, not on the energy cost. Furthermore, most of these works assume the presence of accelerometers, which we do not assume in our research.
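As an illustration of the range-based approach, the following sketch estimates a position from three reference points by linearizing the circle equations; it is a minimal multilateration example, not code from any of the cited systems:

```python
def trilaterate(anchors, dists):
    """Range-based position estimate from three reference points.

    Subtracting the circle equations (x - xi)^2 + (y - yi)^2 = ri^2
    pairwise removes the quadratic terms, leaving a 2x2 linear system
    that is solved here by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = dists
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = r2**2 - r3**2 + x3**2 - x2**2 + y3**2 - y2**2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

# A node at (1, 1) with exact ranges to three known beacons:
x, y = trilaterate([(0, 0), (4, 0), (0, 4)], [2**0.5, 10**0.5, 10**0.5])
assert abs(x - 1.0) < 1e-9 and abs(y - 1.0) < 1e-9
```

In practice the ranges are noisy (signal strength, TDOA, or time-of-flight estimates), so deployed systems use more reference points and a least-squares fit rather than an exact solve.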
5.3 Problem Definition: Location Tracking
Figure 5.1 shows a sensor node in motion. At every localization point, the node invokes its localization mechanism (e.g., GPS or triangulation-based localization) to discover its current location (xi, yi). The localization point vector, denoted Si, is the sequence of localization points collected by a sensor. We assume that the localization mechanism estimates the current position with a reasonable tolerance; the uncertainty introduced by the localization mechanism is represented by the shaded circles in Figure 5.1. With mobile sensors, in the time between two consecutive localization points, the error in the location estimate increases as the node moves (on average) increasingly farther from its last location estimate. To control this error, localization must be repeated frequently enough to ensure that the location estimate meets some application-level error requirement (e.g., that the estimate remains within a prespecified threshold of the actual location). However, carrying out localization with high frequency drains energy. Solutions to this problem must balance the need to bound error against the cost of carrying out localization. Exploring protocols that effectively estimate location while minimizing the localization operations is the Location Tracking (LT) problem we consider in this study. We keep our analysis independent of the specific localization mechanism used. Note that dynamic control of localization is needed whether localization is carried out on demand (i.e., the node
queries neighbors or fixed localization nodes for localization information) or proactively (e.g., by having localization nodes periodically transmit localization beacons, or using GPS). If localization is on-demand, the localization mechanism can be invoked when needed. Alternatively, if the localization is done periodically without control by the sensor node, the node can still control its localization frequency by deciding when to start listening to the beacons. Since receiving packets or GPS signals consumes significant energy, controlling the localization frequency also applies to such schemes. An underlying assumption in this chapter is that an accurate estimate of location is needed continuously; such a situation would occur, e.g., if sensors are continuously collecting data. The primary tradeoff is between the observed localization error and the energy consumed. The instantaneous localization error, E_abs(t), is the Euclidean distance between the reported location and the actual location.
5.4 Dynamic Localization Protocols
In this section, we introduce the proposed protocols for dynamic localization. We evaluate three approaches: (1) static localization, where the localization period is fixed; (2) adaptive localization, where the localization period is adjusted adaptively, for example as a function of the observed velocity, which can be approximated using the last two localization points; and (3) predictive localization, where we use dead reckoning to project the expected motion pattern of the sensor based on the recent history of its motion. As mentioned before, in this work we want to isolate the performance of our protocols from any specific localization algorithm. We assume that the localization algorithm, once executed, gives an estimate of the current location with reasonable accuracy; therefore the error introduced by localization itself is negligible. The focus of this chapter is not the localization algorithm itself but the different policies for deciding when to invoke it: excessive invocation is not energy efficient, while insufficient invocation results in unacceptable error.
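The predictive approach can be sketched with simple linear dead reckoning. The helper below is a hypothetical illustration assuming a constant-velocity model, not the thesis's implementation:

```python
def predict_position(history, t):
    """Dead-reckoning sketch: linearly extrapolate from the last two
    localization points, each given as (x, y, timestamp), to estimate
    the position at time t without invoking actual localization."""
    (x1, y1, t1), (x2, y2, t2) = history[-2:]
    vx = (x2 - x1) / (t2 - t1)  # estimated velocity components
    vy = (y2 - y1) / (t2 - t1)
    return x2 + vx * (t - t2), y2 + vy * (t - t2)

# A node moving east at 2 m/s is predicted 10 m farther after 5 s:
hist = [(0.0, 0.0, 0.0), (2.0, 0.0, 1.0)]
assert predict_position(hist, 6.0) == (12.0, 0.0)
```

When the node's motion is predictable, such extrapolated estimates can substitute for actual localization, which is the energy saving the predictive protocols exploit; when the prediction error grows, localization must be invoked again.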
5.4.1 Static Tracking

The base protocol, which we call Static Fixed Rate (SFR), localizes every t seconds. The sensor node reports the coordinates discovered in its most recent localization as its current location. This protocol is simple and its energy expenditure is independent of mobility; however, its accuracy varies with the mobility of the sensors. Specifically, if a sensor is moving quickly, the error will be high; if it is moving slowly, the error will be low, but the energy efficiency will also be low.
5.4.2 Adaptive Tracking

In these protocols, a sensor adapts its localization as a function of its mobility: the higher the observed velocity, the more often the node should localize to maintain the same level of error. The protocol we investigate is called Dynamic Velocity Monotonic (DVM) tracking. In DVM, whenever a node localizes, it computes its velocity by dividing the distance it has moved since the last localization point by the elapsed time. Based on this velocity, the next localization point is scheduled at the time when a prespecified distance will have been traveled if the node continues at the same velocity.
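This scheduling rule can be sketched as follows (a minimal illustration assuming straight-line motion between localization points; the function name is hypothetical):

```python
def next_localization_interval(prev_point, curr_point, d_thresh):
    """DVM scheduling sketch: estimate velocity from the last two
    localization points (x, y, timestamp) and schedule the next
    localization for when the tolerance distance d_thresh would be
    covered at that velocity."""
    (x1, y1, t1), (x2, y2, t2) = prev_point, curr_point
    distance = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
    velocity = distance / (t2 - t1)
    return d_thresh / velocity

# Moving 10 m in 5 s (2 m/s) with a 6 m tolerance: localize again in 3 s.
assert next_localization_interval((0, 0, 0), (10, 0, 5), 6.0) == 3.0
```

A real implementation would also cap the interval when the observed velocity is near zero (a stationary node would otherwise never localize again) and clamp it from below for fast movers.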