Adaptable Protocol Stack Architecture for Future Sensor Networks
A Thesis Presented to The Academic Faculty by
Rajnish Kumar
In Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy
College of Computing Georgia Institute of Technology December 2006
Approved by:
Dr. Umakishore Ramachandran, Advisor College of Computing Georgia Institute of Technology Dr. Mostafa Ammar College of Computing Georgia Institute of Technology Dr. Brian Cooper College of Computing Georgia Institute of Technology
Dr. Raghupathy Sivakumar School of Electrical and Computer Engineering Georgia Institute of Technology Dr. Willy Zwaenepoel School of Computer and Communication Sciences EPFL
Date Approved: Aug 25th, 2006
TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

SUMMARY

1  INTRODUCTION: AN OVERVIEW
   1.1  Problem Statement
   1.2  Research Context
        1.2.1  Technology Trends
        1.2.2  Application Trends
   1.3  Related Work
   1.4  Scope of the Thesis
   1.5  Outline of the Thesis and Research Contributions

2  SENSORSTACK: THE PROPOSED ARCHITECTURE
   2.1  Motivating Applications
        2.1.1  Augmenting Situation Awareness in Battlefield
        2.1.2  Surveillance Application for Homeland Security
        2.1.3  Robot Coordination and Peer-to-Peer Decision Making
        2.1.4  Security Analysis of Networked Systems
   2.2  FWSN Requirements and IP Stack
   2.3  SensorStack Design
   2.4  Summary

3  NETWORK LEVEL ADAPTABILITY
   3.1  Introduction
   3.2  Application Context and Requirements
        3.2.1  Architectural Assumptions
        3.2.2  DFuse Architecture Components
        3.2.3  Fusion Module
   3.3  Placement Module
        3.3.1  The Role Assignment Problem
        3.3.2  Cost Functions
        3.3.3  In Search of Optimality
        3.3.4  The Role Assignment Heuristic
        3.3.5  Analysis of the Role Assignment Heuristic
   3.4  Implementation
        3.4.1  Fusion Module
        3.4.2  Placement Module
   3.5  Evaluation
        3.5.1  Application Level Measurements of DFuse
        3.5.2  Simulation-based Study of Large Networks and Applications
   3.6  Related Work
   3.7  Summary

4  NODE LEVEL ADAPTABILITY
   4.1  Introduction
   4.2  Motivation
   4.3  Organization and Information Taxonomy
        4.3.1  Organization and Information Sharing
        4.3.2  Taxonomy
   4.4  Information Exchange Service Design
        4.4.1  Design Goals
        4.4.2  IES Architecture
        4.4.3  Discussion
   4.5  Implementation
        4.5.1  IES in TinyOS
        4.5.2  IES in Linux
   4.6  Evaluation
        4.6.1  Micromeasurements
        4.6.2  Macro Evaluation
   4.7  Related Work
   4.8  Summary

5  INFORMATION DISSEMINATION SERVICE: IES ACROSS NODES
   5.1  Introduction
   5.2  Problem Definition
        5.2.1  Research Issues
   5.3  Related Work
   5.4  IDS Design
        5.4.1  Region Creation
        5.4.2  Data Dissemination
   5.5  Evaluation
        5.5.1  IDS Overhead
        5.5.2  Quality of Decision
        5.5.3  Comparison Summary
   5.6  Supporting Global Queries over IES Data

6  INFORMATION DISSEMINATION SERVICE: BULK-DATA BROADCAST
   6.1  Introduction
   6.2  Background and Related Work
   6.3  FBcast Protocol
   6.4  FBcast Evaluation
        6.4.1  Network Model and Assumptions
        6.4.2  FBcast without Any Rebroadcast
        6.4.3  FBcast with Probabilistic Rebroadcast
        6.4.4  Protocol Extension with Repeaters
   6.5  Discussion and Limitations
   6.6  Summary

7  CONCLUSIONS AND FUTURE WORK
   7.1  Directions for Future Research

REFERENCES
LIST OF TABLES

1  Difference between current and future WSN systems
2  Impact of sharing neighborhood information
3  Cross-layer Information Produced by Different Protocol Layers
4  Example XML descriptions for a taxonomy of cross-layer information
5  Simulation Parameters for the IDS experiments
6  Hardware Platform Evolution
LIST OF FIGURES

1  An example application. (A) pictorial representation of the task graph (with expected data flow rates on the edges); (B) textual representation of the task graph.
2  Integrated Digital Video Surveillance System: A useful technology for situation awareness in battlefields.
3  A robot coordination application.
4  Security monitoring of a wide area network.
5  SensorStack: A proposed FWSN stack. The left half of the figure shows the layers of the stack as well as their relationship to the modules that are common to the layers. The right half lists the functionalities provided by the respective modules shown in the left half.
6  DFuse Architecture.
7  An example task graph using the fusion channel abstraction.
8  Mapping a task graph using minimum Steiner tree (MST): Example shows that MST does not lead to an optimal mapping. For Gn, the edge weights can be thought of as hop counts, and for Gt, as transmission volume. Edge weights on the overlay graphs (c and d) are obtained by multiplying the edge weights of the task graph with the corresponding edge weights of the network links.
9  An example failure scenario showing a task graph overlaid on the network. An edge in this figure may physically comprise multiple network links. Every fusion point has only local information, viz. the identities of its immediate consumers.
10  Linear Optimization.
11  Triangular Optimization.
12  Fusion Module Components.
13  iPAQ Farm Experiment Setup. An arrow represents that two iPAQs are mutually reachable in one hop.
14  Comparison of different cost functions. Application runtime is normalized to the best case (MT2), and total remaining power is presented as a percentage of the initial power.
15  The network traffic timeline for different cost functions. The X axis shows the application runtime and the Y axis shows the total amount of data transmission per unit time.
16  Effect of data contraction on initial mapping quality. Network grid size is 32x32.
17  Effect of input task graph size upon total transmission cost of initial mapping. Network grid size is 32x32.
18  Effect of input task graph size upon control overhead of initialization algorithms. Network grid size is 32x32.
19  Effect of network size upon control overhead of the initialization algorithms. Average distance of the sink from the sources is kept in proportion to the grid width.
20  An example protocol stack configured to run the fusion application.
21  Cross-layer information exchange.
22  SensorStack: A proposed FWSN stack (reproduced from Chapter 2).
23  IES architecture. The top half of the diagram shows the Data Management Module (DMM), while the bottom half shows the Event Management Module (EMM). Note that EMM acts as a subscriber to the DMM component.
24  IES API summary.
25  Use of asynchronous signaling in IES when the requested attribute is not available in IES.
26  Use of asynchronous signaling in IES for handling periodic updates.
27  DMM memory hierarchy. The direct-mapped cache maps an attribute to its location in the data bank.
28  Comparing information access latency using HSN's neighbors interface and using the IES interface. In the IES case, the neighborhood data is accessed directly from the DMM cache.
29  Memory access overhead for DMM in TinyOS. Figure (A) shows the results when the attribute is present in the data bank, and Figure (B) shows the results when the attribute is not in IES.
30  Memory access overhead comparison for DMM in TinyOS. For the IES case, a 32-way set-associative data bank is used.
31  DMM performance on Linux: For (A), the direct-mapped cache size is fixed at 16 entries and the data bank size is varied; for (B), the data bank size is fixed at 256 entries and the direct-mapped cache size is varied. In the legends, store size represents the size of the data bank, and CPU time represents the latency incurred in accessing an attribute from IES.
32  EMM performance on Linux: Time spent in checking the rules is minimal compared to the time taken by a handler routine in serving the notification event. In the legends, "number of matching rules" represents the number of rules to be evaluated, and time represents the delay incurred for different steps.
33  Task graph overlay over a grid topology: 11x11 network with 20 feet spacing.
34  Network cost improvement with time because of role migration using cost function MT.
35  Potential saving in control overhead that can be achieved by information sharing via the IES. In the legends corresponding to the fusion layer overhead, 4, 12, and 24 are the neighborhood sizes, i.e., the number of nodes that are queried for their health information; period is a configurable parameter that represents the periodicity of neighborhood information exchange among nodes.
36  IDS overhead for executing multiple queries. Figure A shows the overhead for a grid topology with 10' spacing, and Figure B shows that for a grid with 15' spacing. Pbcast incurs lower overhead than explicit collection using multiple diffusion trees for different queries.
37  Reliability in terms of the number of responses received by the query initiators (region heads). Figure A shows the results for a grid topology with 10' spacing, and Figure B shows those for a grid with 15' spacing.
38  FBcast at source: m original packets are encoded into n packets and injected into the network.
39  FBcast at recipients: k packets are chosen from the receive queue and decoded back to m.
40  The effect of forward error correction (FEC) on reliability for varying bit-error probabilities using a simple radio model. Figure A compares a stretch factor of 2.0 versus the source retransmitting the message twice. Figure B corresponds to similar experiments, but with a stretch factor of 4.0 and four retransmissions. The network consists of 121 motes deployed using the simple radio model.
41  Effect of forward error correction (FEC) on reliability using the empirical radio model.
42  Effect of inter-mote spacing on reliability.
43  Topographical picture of reliability for two typical runs showing how the reliability decreases as we move away from the grid center. 121 motes placed on an 11x11 grid with inter-mote spacing of 5 feet.
44  Pbcast performance for 121 motes deployed on an 11x11 grid with varying inter-mote spacing (x-axis).
45  FBcast (with probabilistic rebroadcast) performance for 121 motes deployed on an 11x11 grid with varying inter-mote spacing (x-axis).
46  Transmission overhead comparison of Pbcast and FBcast for a simple topology.
47  Effect of stretch factor on reliability. Mote spacing = 10 feet. 121 motes deployed on an 11x11 grid. Forwarding probability is varied along the x-axis.
48  The effect of injection rate: how interference causes the reliability to drop at higher forwarding probability, though the amount of retransmissions increases as expected.
49  Probabilistic broadcast (p = 0.8) results for the multi-hop scenario. The X axis shows the total number of motes deployed uniformly with inter-mote spacing of s = 6' and 10'.
50  Topography of successful reception for motes deployed with 10' spacing in a 200'x200' area.
51  Performance of FBcast (n = 40, p = 2/40) with repeaters for 441 motes deployed with inter-mote spacing s = 6' and 10'.
52  Pseudocode for FBcast with repeater.
53  FBcast with repeaters for motes deployed with 10' spacing in a 200'x200' area: (A) shows the coverage, and (B) shows the overhead in terms of the number of repeaters and their positions.
54  Performance of Pbcast (p = 0.8) with repeaters for 441 motes deployed with inter-mote spacing s = 6' and 10'.
SUMMARY
Current technology trends suggest that futuristic wireless sensor networks (FWSNs) are well equipped to support applications such as video surveillance and emergency response that have, in addition to high computation and communication requirements, a need for dynamic data fusion. Because of the dynamic and heterogeneous nature of the FWSN environment, adaptability is emerging as a basic need in designing network protocols. Adaptability can be broadly supported at two levels: network level and node level. At the network level, different nodes are the entities that adapt their roles in a cooperative manner to improve network-level cost metrics. Similarly, at the node level, different protocol modules are the entities that adapt their behavior in a holistic manner to best perform the roles assigned to the node.

The goal of this thesis is to provide an adaptable protocol stack architecture for data fusion applications. Towards this goal, the thesis presents a software architecture called SensorStack that raises and answers three sets of questions.

First, towards network-level adaptability, how best to dynamically adapt the placement of a fusion application task graph on the network? How best to support such adaptation in an application-specific manner? We have designed a distributed role assignment algorithm and implemented it in the context of DFuse, a framework for distributed data fusion. Simulation results show that the role assignment algorithm significantly increases the network lifetime over static placement.

Second, towards node-level adaptability, how best to facilitate cross-layering on a node to foster agile adaptation of a node's behavior commensurate with the network-level changes? SensorStack provides the information exchange service (IES), a framework for cross-module information exchange. IES can be thought of as a centrally controlled bulletin board within each node where different modules can post available data, request information, and get notified when information becomes available. IES preserves the benefits of layering while facilitating adaptability. IES has been implemented in TinyOS and Linux, to show both the feasibility of the design and the utility of cross-layering in increasing application longevity.

Third, how best to integrate node-level and network-level adaptability? How best to design a scalable dissemination protocol for sharing information across nodes? To tie network-level and node-level adaptability together, control data published in IES needs to be shared across the network. SensorStack uses a probabilistic broadcast-based information dissemination service (IDS) for control data. Simulation experiments show that IDS allows nodes to share IES data more efficiently than using multiple diffusion trees. More broadly, efficient and reliable dissemination of information over a large area is a critical ability of a sensor network, for purposes such as software updates and transferring large data objects (e.g., camera images). Thus, for bulk data broadcast, we design FBcast, an extension of probabilistic broadcast based on the principles of modern erasure codes. Simulation results on TOSSIM show that FBcast offers higher reliability with fewer retransmissions than traditional broadcast.
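The erasure-coding principle behind FBcast can be made concrete with a toy example: encode m original packets into n > m coded packets so that any m received coded packets suffice to reconstruct the data. The sketch below uses a simple polynomial code over a prime field (payloads modeled as integers; all names and parameters are invented for illustration and are not FBcast's actual codes, which are described in Chapter 6):

```python
# Toy sketch of the erasure-coding principle behind FBcast: m original
# packets are encoded into n > m coded packets, and ANY m received coded
# packets suffice to reconstruct the originals. Payloads are modeled as
# integers modulo a prime; this illustrates the idea, not FBcast's code.

P = 2**31 - 1  # prime modulus for all arithmetic

def encode(data, n):
    """Treat the m payloads as coefficients of a degree-(m-1) polynomial f
    and emit n coded packets (x, f(x)) for x = 1..n."""
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(data)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def decode(packets, m):
    """Recover the m payloads from any m coded packets by Lagrange
    interpolation, returning the polynomial's coefficient list."""
    pts = packets[:m]
    coeffs = [0] * m
    for j, (xj, yj) in enumerate(pts):
        num = [1]   # coefficients (low to high) of prod_{k != j} (X - xk)
        denom = 1   # prod_{k != j} (xj - xk)
        for k, (xk, _) in enumerate(pts):
            if k == j:
                continue
            num = [((num[i - 1] if i > 0 else 0)
                    - xk * (num[i] if i < len(num) else 0)) % P
                   for i in range(len(num) + 1)]
            denom = denom * (xj - xk) % P
        scale = yj * pow(denom, P - 2, P) % P  # Fermat modular inverse
        coeffs = [(c + scale * num[i]) % P for i, c in enumerate(coeffs)]
    return coeffs

data = [7, 42, 99]            # m = 3 original "packets"
pkts = encode(data, 6)        # stretch factor n/m = 2
assert decode(pkts[3:], 3) == data   # any 3 of the 6 reconstruct the data
```

The stretch factor n/m trades bandwidth for loss tolerance: here the source can lose any half of its injected packets and receivers still decode the full message, which is why FBcast can beat naive retransmission at the same overhead.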
CHAPTER 1
INTRODUCTION: AN OVERVIEW
With the rapid advance in technology, it is becoming increasingly feasible to realize a small-footprint sensor platform with a fast processor, a sizable memory, and a radio transceiver in addition to the sensors themselves. There is an ever-evolving continuum of sensing, computing, and communication capabilities from smartdust, to sensors, to mobile devices, to desktops, to clusters. With this evolution, capabilities that are usually the purview of larger-footprint devices are becoming available in smaller-footprint devices as well. For example, tomorrow's mote will have computational resources comparable to today's mobile devices, and tomorrow's mobile devices will have resources comparable to current-day desktops. These trends suggest that futuristic wireless sensor networks (FWSNs) are well equipped to support applications such as video surveillance and emergency response that have, in addition to high computation and communication requirements, a need for rapid dynamic deployment.

FWSNs are rapidly creating an opportunity to support advanced sensing applications that use high bit-rate sensors, apply sophisticated operations on in-network data of different types, and are distributed in nature. Supporting such fusion applications in FWSN environments presents many new challenges. Fusion applications in their distributed form demand the execution of a sense-fuse-actuate loop; but thus far, sensor network research has mainly focused on sense-aggregate-transmit functionality, where aggregation is done on the same type of data to save communication. Thus, one clear challenge is the problem of distributed role assignment, i.e., how to designate nodes to do fusion or data forwarding such that application-specific requirements are met. Other challenges emanate from the inherent nature of the FWSN environment, which is heterogeneous and dynamic.

Heterogeneity in FWSN can be attributed to recent technological advances and economics: it is often more useful to deploy a network of devices of different capabilities. Also, because of the dynamic nature of resource usage, even a homogeneous deployment can become heterogeneous in terms of resource availability. Because of the dynamic and heterogeneous nature of the FWSN environment, adaptability is emerging as a basic need in designing network protocols for FWSN. Adaptability can be broadly supported at two levels: network level and node level. At the network level, different nodes are the entities that adapt their roles in a cooperative manner to improve network-level cost metrics. Similarly, at the node level, different modules of the per-node protocol stack are the entities that adapt their behavior in a holistic manner to best perform the role of the node. The goal of this thesis is to investigate network-level and node-level adaptability in the protocol stack in an efficient manner.
1.1 Problem Statement

The challenges identified above, coupled with the centrality of in-network data fusion functions in sensor networks, form the basis for the intellectual core of this thesis. In particular, we believe the time is ripe for investigating the software architecture for such futuristic sensor networks. Such an architecture will address the following key issues:

• Distributed role assignment to provide network-level adaptability: How to dynamically adapt the placement of an application task graph on the network? How to support such adaptation in an application-specific manner? How to handle failures without incurring expensive communication overhead?

• Structured cross-layer information sharing to provide node-level adaptability: How to facilitate cross-layering on a node to foster agile adaptation of a node's behavior commensurate with network-level changes? What information should different protocol modules share with each other, and how to achieve that sharing with minimal latency?

• Information sharing among neighbor nodes to help adaptability and increase network lifetime: How to support sharing of control data among nodes to support adaptability? How to achieve such information dissemination while incurring a justifiable communication overhead?
1.2 Research Context

This section presents the context for this research and the assumptions made thereof.

1.2.1 Technology Trends

High bit-rate sensing is ubiquitous: CMOS camera technology is improving rapidly, with enhanced functionality through efficient on-chip integration, power-saving technologies, and light weight. Even in a small form factor amenable to cell phones, CMOS cameras, unlike charge-coupled device (CCD) cameras, provide enough resolution that camera phones have become ubiquitous today. For example, Samsung is introducing a 5M-pixel CMOS camera that consumes less power and costs less than its CCD counterpart [1].

Today's handhelds are tomorrow's motes: On the hardware end, processing technology has also been improving rapidly, with an ever-increasing number of computation cycles and amount of memory at hand. For example, while the Mica2-based Berkeley motes [50] introduced in 2003 had only a 7MHz 8-bit ATmega128L microcontroller with 4KB RAM / 128KB flash, today's Intel Mote2 [29] has a 520MHz 32-bit PXA271 XScale processor with 32MB SDRAM / 32MB flash. Though demand for smaller footprints and lower cost will continue to constrain the resources available on sensor devices, we believe that there will exist a continuum in the capabilities of FWSN devices.

Wireless bandwidth and energy continue to be scarce: Though there is an ample increase in bandwidth support from Mica2 motes (19.2 kbps) to the Intel Mote2 (250 kbps), because of the wireless nature of the medium and the harshness of sensor network deployments, the actual observed bandwidth is not as high. Also, because of the ad hoc nature of FWSN deployments, many FWSN devices may be battery driven, making energy an important resource to conserve. Overall, trends suggest that communication will continue to be a resource to be optimized.
1.2.2 Application Trends

[Figure 1: An example application. (A) pictorial representation of the task graph (with expected data flow rates on the edges); (B) textual representation of the task graph.]

Distributed fusion is gaining importance: Advances in sensing capabilities and in computing and communication infrastructures are paving the way for new and demanding fusion applications. Video-based surveillance, emergency response, disaster recovery, habitat monitoring, and telepresence are all examples of such applications. These applications in their full form are capable of stressing the available computing and communication infrastructures to their limits. Fusion applications, as we refer to such applications in this thesis, have the following characteristics: (1) they are continuous in nature, (2) they require efficient transport of data from/to distributed sources/sinks, and (3) they require the use of in-network fusion and filtering to carry out compute-intensive tasks in a timely manner. Consider, for example, a video-based surveillance application. Cameras are deployed in a distributed fashion; the images from the cameras are filtered in some application-specific manner and fused together in a form that makes it easy for an end user (human or some program) to monitor the area. The compute-intensive part may analyze multiple camera feeds from a region to extract higher-level information such as "motion", "presence or absence of a human face", or "presence or absence of any kind of suspicious activity". Figure 1 shows the task graph for this example application, in which filter and collage are fusion functions that transform input streams into output streams in an application-specified manner.
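To make the task-graph abstraction concrete, the sketch below represents a surveillance-style task graph as edges annotated with expected rates, and computes the transmission volume of one placement of that graph on the network, which is the kind of cost the role assignment machinery of Chapter 3 tries to minimize. All node names, rates, and hop counts here are invented for illustration; they are not the exact graph of Figure 1.

```python
# Illustrative task graph for the surveillance example: two camera sources
# feed a "filter" fusion point whose output feeds a "collage" point at the
# sink. Edge rates (in units of a base rate x) are made up for illustration.
task_edges = [
    ("camera1", "filter", 2.0),   # raw video stream, rate 2x
    ("camera2", "filter", 2.0),
    ("filter", "collage", 1.0),   # filtering contracts the data
    ("collage", "sink", 1.0),
]

def transmission_cost(placement, hops):
    """Cost of mapping the task graph onto the network: sum over task-graph
    edges of (expected data rate) * (hop count between the hosting nodes)."""
    return sum(rate * hops[(placement[u], placement[v])]
               for u, v, rate in task_edges)

# Hypothetical placement on network nodes A-D with pairwise hop counts.
placement = {"camera1": "A", "camera2": "B", "filter": "C",
             "collage": "D", "sink": "D"}
hops = {("A", "C"): 1, ("B", "C"): 2, ("C", "D"): 1, ("D", "D"): 0}
cost = transmission_cost(placement, hops)  # 2*1 + 2*2 + 1*1 + 1*0 = 7.0
```

Moving the filter role closer to the cameras lowers this cost because the high-rate raw streams then travel fewer hops, which is exactly the intuition behind dynamic role migration.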
Table 1: Difference between current and future WSN systems

  Application characteristics — Current: mostly static, low bit-rate communication, simple computations. Future: often dynamic, high bit-rate communication, and arbitrarily complex computations.
  Fusion Requirement — Current: centralized decision. Future: in-network distributed decision.
  Optimization — Current: application-independent (e.g., energy). Future: application-specific, with different goals and priorities.
  Example Application — Current: temperature monitoring. Future: battlefield decision support.
  Communication — Current: infrastructure-based or small ad hoc network. Future: large ad hoc and potentially unreliable network.
  Network — Current: homogeneous, mostly static. Future: heterogeneous, mobile.
1.3 Related Work

With FWSN being the focus, the proposed SensorStack research is complementary to other network architecture projects. For example, a project from UC Berkeley emphasizes lowering the systems interface for sensor networks [16], through techniques such as a unified link-layer abstraction [61]. Their focus is on low bit-rate applications. While the link-layer abstraction is well thought out, simply lowering the system interface, without any standard approach to inter-module interactions, makes holistic optimizations and adaptability difficult. Such designs may work well for a homogeneous and quasi-static deployment, where the main focus is on dealing with resource constraints rather than on providing dynamic adaptability. Our primary focus in this thesis is on high bit-rate applications. Our emphasis on a holistic approach to a layered architecture, with system support for cross-layering, is aimed at heterogeneous and dynamic deployments that demand very high adaptability.
1.4 Scope of the Thesis

This thesis does not consider the following aspects that are related to the theme of this research:

• Adaptable routing protocols for node-to-node communication: Given the communication-centric nature of FWSN applications, data routing is an integral part of the FWSN network stack. An ideal routing protocol for FWSN will adapt its routing decisions to satisfy the application requirements. This thesis looks into systems and network services that will help design an adaptable routing protocol, but we do not design any novel node-to-node routing protocol. There is a considerable amount of research on power-aware routing that can be used in the FWSN stack.

• Medium access control (MAC) layer adaptability: This thesis does not address MAC-layer issues and duty-cycle control. The design of an adaptable MAC protocol for sensor networks using cross-layer information has been studied by the Compass project [55] and by Van et al. [72].

• Adapting physical characteristics: Transmission range control and mobility can be used to change the network topology and improve FWSN performance. We do not look into how to achieve adaptability using such mechanisms.

• Providing real-timeliness and quality of service (QoS) guarantees to the application layer: QoS support is desirable for FWSN applications, but providing such guarantees in the presence of dynamism is a challenge. We have not explored QoS issues in this thesis.

• Safety issues of executing fusion code inside the kernel: When we allow application-supplied fusion functions to run in the network stack, data fusion support becomes even more challenging because of code security concerns. Systems techniques like sand-boxing and special limitations on programming for safety reasons [75] can be useful here, but further investigation into the safety issues is beyond the scope of this thesis.
1.5 Outline of the Thesis and Research Contributions
This thesis starts with the design of an adaptable protocol stack, SensorStack, for supporting fusion applications in the FWSN environment. Though SensorStack is not the only way to design a protocol stack, it serves as an example of laying out different sensor network
protocols together to facilitate adaptability. It also gives a context to describe the individual techniques that are the main contributions of this thesis, presented in the following order.

• Design, development, and evaluation of a distributed algorithm for role assignment: Towards network-level adaptability in FWSN, Chapter 3 presents a distributed role assignment algorithm in the context of the DFuse system. DFuse is an architectural framework for dynamic application-specified data fusion in sensor networks. It bridges an important abstraction gap for developing advanced fusion applications that take into account the dynamic nature of applications and sensor networks. Elements of the DFuse architecture include a fusion API, a distributed role assignment algorithm that dynamically adapts the placement of the application task graph on the network, and an abstraction migration facility that aids such dynamic role assignment. Experimental evaluations show that the API has low overhead, and simulation results show that the role assignment algorithm, which is the focus of the chapter, significantly increases the network lifetime over static placement.

• Identification, implementation, and evaluation of cross-layer support in SensorStack: Towards node-level adaptability in FWSN, Chapter 4 presents a novel service, the information exchange service (IES), as a framework for cross-module information exchange. IES is a centrally controlled bulletin board where different modules can post available data or request useful information, and get notified when the information becomes available. IES is integrated into the proposed SensorStack architecture, which preserves the benefits of layering while facilitating adaptability. IES has been implemented in TinyOS and Linux, both to show the feasibility of the design and to demonstrate the utility of cross-layering to increase application longevity.
• Design, implementation, and evaluation of a dissemination service for control data across FWSN nodes: Towards tying the network and node level adaptability together, control data published in IES needs to be shared across the network. Chapter 5 presents an information dissemination service (IDS) to facilitate the exchange of control data. To minimize the communication overhead inherent in sharing control data
in a distributed manner, IDS uses probabilistic broadcast and query-aware clustering. Evaluation experiments show that using probabilistic broadcast incurs lower communication overhead than using multiple diffusion trees for communicating the IDS data.

• Design, implementation, and evaluation of a broadcast protocol for improving reliability in FWSN: To support in-network fusion of sensed data, applications may need to disseminate data from a central command control to the network, with varying needs for reliability. Similarly, IES control data may need to be disseminated within a cluster with different reliability requirements. Chapter 6 presents FBcast, a new broadcast protocol based on the principles of modern erasure codes. FBcast exposes a set of tunable parameters such that different applications may adjust the protocol behavior to suit their reliability requirements. Since a lower reliability need incurs lower transmission overhead, an application can save on communication by tuning the FBcast behavior dynamically. The tunability of FBcast makes it an important tool in the bag of tricks of an adaptable protocol stack. We show that our approach provides high reliability with fewer retransmissions than traditional broadcasts.
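To make the erasure-coding idea concrete, the following toy encoder illustrates tunable redundancy with simple XOR parity (a sketch only: the function names and the parity-group scheme are ours, and the actual FBcast of Chapter 6 builds on stronger modern erasure codes). Each group of `group_size` data packets is sent with one XOR parity packet; any single loss within a group is then recoverable, and shrinking the group size buys more reliability at the cost of more transmissions:

```python
def xor(a, b):
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode(packets, group_size):
    """Split data packets into parity groups; emit one XOR parity
    packet per group. Smaller group_size -> higher redundancy."""
    groups = []
    for i in range(0, len(packets), group_size):
        group = list(packets[i:i + group_size])
        parity = group[0]
        for p in group[1:]:
            parity = xor(parity, p)
        groups.append((group, parity))
    return groups

def decode(groups):
    """Recover the original packets; each group tolerates the loss
    (marked None) of at most one packet, rebuilt from the parity."""
    recovered = []
    for group, parity in groups:
        missing = [j for j, p in enumerate(group) if p is None]
        if len(missing) == 1:
            rebuilt = parity
            for p in group:
                if p is not None:
                    rebuilt = xor(rebuilt, p)
            group[missing[0]] = rebuilt
        recovered.extend(group)
    return recovered
```

With `group_size=1` every packet is effectively duplicated (maximum redundancy); a large `group_size` approaches a single extra packet per broadcast, which is the tunability knob in miniature.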
CHAPTER 2
SENSORSTACK: THE PROPOSED ARCHITECTURE
This chapter explains SensorStack, our proposed protocol stack architecture for FWSN. Because FWSN requirements differ from those of current WSN and Internet applications, this chapter looks into what new features are needed in SensorStack. The following presentation is expected to serve two purposes. First, it will give background information and motivation for a new protocol stack. Second, it will give a context to present the ideas explored in this thesis.
2.1 Motivating Applications
Advances in sensing capabilities and in computing and communication infrastructures are paving the way for new and demanding applications. Video-based surveillance, emergency response, disaster recovery, habitat monitoring, and telepresence are all examples of such applications. We set the context for our data fusion architecture by briefly surveying these application scenarios.

2.1.1 Augmenting Situation Awareness in the Battlefield
The quality of a command decision can be significantly improved by increasing situation awareness. Army personnel, trying to shoot an enemy target, would like to have a comprehensive assessment of the ground situation. They would like to be aware of an aerial view of the field, positions of friends and foes, and attributes of the enemy target (such as direction and velocity of movement if it is a mobile target) from sensor data. Ideally, all such information should be fused together and presented in a form that is easily actionable for a soldier on the battlefield. Given that such personnel have to act in an expeditious manner [5], the attention span for dealing with raw data is minimal if not non-existent. A situation-specific digest of distributed sensor data and collected intelligence needs to be presented
in a timely manner in the form of actionable inferences to a soldier. An important consideration in this context is the priority of information and the locality of dissemination. For example, information regarding an approaching enemy helicopter is of higher priority than information about troop deployment in an adjoining territory. Similarly, the soldiers who are threatened by the approaching enemy helicopter define the locality for the immediate information dissemination. Such meta data (e.g. priority and locality) have to be taken into consideration for information fusion and dissemination, and for meeting the quality of service guarantees.
Figure 2: Integrated Digital Video Surveillance System: A useful technology for situation awareness in battlefields.

For example, the Integrated Digital Video Surveillance System (IDVS), shown in Figure 2, is a technology in development [6] that can combine different CMOS images taken from ground and aerial camera sensors to present an integrated image to ground troops. A centralized host computer fuses the images for presentation. Data from remote camera sources must be transported to the centralized host, and doing so consumes a considerable amount of energy and bandwidth. Nevertheless, such a system forms an important part of a tactical deployment. Therefore, system mechanisms that scale with the size of the deployment are important.

2.1.2 Surveillance Application for Homeland Security
A video-based surveillance system is an attractive solution for threat identification and reduction. Figure 1(a) shows the task graph for an example surveillance application. Data obtained from heterogeneous sensors can be fused with the camera streams to refine the
in-network processing shown in the task graph. Time may be the meta-data of importance for information fusion and dissemination in such applications.
Figure 3: A robot coordination application.
2.1.3 Robot Coordination and Peer-to-Peer Decision Making
Robots can be very useful in reconnaissance missions, hazardous situations, disaster recovery, and situations that are physically challenging. An effective operation may need to collect data from multiple sensors and information sources, and fuse them together to generate higher-level inferences. Robot teams may function in a peer-to-peer networked mode, or in concert with humans in such applications for intelligent decision-making. The meta data of importance for information fusion and dissemination in such applications may be a combination of location and time.
Figure 4: Security monitoring of a wide area network.
2.1.4 Security Analysis of Networked Systems
With the increased reliance on information dissemination using public networks, there is an increased threat to the integrity and safety of the transmitted information in military applications, electronic commerce, and personal privacy. Organizational networks (such as those of large businesses or the US defense forces) comprise thousands of nodes, distributed throughout the US and the rest of the world. Typically, gateway servers in such networks may need to deploy intelligent security agents to check for denial-of-service attacks, filter worms and viruses, and perform other security and integrity checks. Figure 4 shows a typical wide area networked system. Any potentially suspicious event observed in different gateways may have to be collected and analyzed for possible correlation in the pattern of events. Bringing all the events to a centralized analyzer is a huge overhead on the backbone network, and therefore such events are often ignored, or sampled at a very low periodicity to decrease the communication overhead. As information flows through the Internet, it would be ideal if the information were analyzed in the network (the closer to the source the better) to detect any cyber attacks, fuse knowledge of other potential threats, take local corrective actions (such as dropping certain packets and/or raising an alarm), and disseminate warnings to other nodes in the network. The meta data of interest for fusion and dissemination in such an application perhaps includes information such as a particular IP address and/or a port number.
2.2 FWSN Requirements and the IP Stack
The design as well as the dynamic use of futuristic sensor networks offers several challenges from a software architecture point of view. First of all, the network will comprise nodes with different computation, communication, and sensing capabilities. Such a mix would allow the redundant deployment of nodes with limited capabilities in harsh environments (e.g. jungle terrains) and nodes with sophisticated capabilities in mild environments (e.g. safe neighborhoods). Second, the energy level of the nodes of the network will vary both with their inherent capabilities as well as their dynamic use (certain network paths may be more heavily used than others). Since energy will continue to be the most important consideration for prolonging network lifetime, it is important to ensure that the network is
load balanced to avoid partitioning the network.

The networks will be used simultaneously for a diverse set of applications. For example, imagine a metro area blanketed with an array of cameras. The video from the cameras may be used for a variety of applications including traffic monitoring, video surveillance, emergency response, and urban warfare. Thus a third challenge has to do with the diversity in the computation and communication requirements of applications. For example, a traffic monitoring application may require reliable data delivery for accurate modeling of traffic patterns, while a surveillance application may require best-effort real-time delivery of fresh data. A fourth challenge arises due to the inherent pattern of communication in the application. In some applications (e.g. generating temperature contours of a terrain) the communication may be data driven, while in some others (traffic advisory) the communication may be topological.

Finally, sensor networks differ from traditional networks (such as wide area networks carrying TCP/IP traffic) in one fundamental way. The TCP/IP protocol architecture is designed for end-to-end communication, with intermediate nodes performing a simple relay function. Sensor networks typically require intermediate nodes to perform in-network aggregation (or data fusion) to reduce the network traffic, which is a primary source of energy drain. Data fusion, if supported at the application layer, can lead to performance inefficiencies when implemented on traditional networks (due to the need to traverse all the protocol layers at each intermediate node). The requirement for in-network data fusion will be even more important in futuristic sensor networks that are expected to carry high-bandwidth stream data.

The explosive growth of the Internet has been spurred to a great extent by the modularity of the layered network architecture.
Adherence to the strict interfaces in the different layers has enabled the independent development of robust protocols and their validation. While the focus on layered software architecture has been a useful design guideline for Internet protocols, it is becoming clear that the decisions taken at runtime in the different layers could be better optimized with cross-layer information. This is particularly true in dynamic settings when the network conditions can change quite dramatically. For example, researchers have shown the utility of explicit congestion notification from the routers to the
transport layer [63], and link status information to the IP layer in a wireless setting [87].

Figure 5: SensorStack: A proposed FWSN stack. The left half of the figure shows the layers of the stack as well as their relationship to the modules that are common to the layers. The right half lists the functionalities provided by the respective modules shown in the left half.
2.3 SensorStack Design
This section presents the layers of our proposed FWSN stack, called SensorStack (see Figure 5). It uses the OSI model as a guideline for mapping the necessary functionalities into different layers. Since the FWSN protocol requirements are different from the Internet's, IP layering cannot be directly applied to SensorStack. The three main mandatory services needed by sensor applications are medium access control (MAC), logical naming and data-centric routing (data service), and data fusion. These three services are in three separate layers. Other services that can improve the performance and energy optimizations, e.g. the localization service, are put together as an optional service bundle (the helper service layer in Figure 5). The inter-relationship among the protocol modules shown in Figure 5(A) is as follows: if two layers abut one another, there is a well-defined interface between them; if they do not abut one another, they do not have such an interface. Thus, the Data Fusion layer interfaces with the Data Service layer, the Application, and the Information Exchange Service. However, it does not interface with any of the other layers. The radio and MAC layers are self-explanatory. At the heart of SensorStack is the information exchange service (IES), which serves as a data broker among the different modules of the stack and the application to facilitate cross-layer optimizations.
The functionalities provided by the different layers are fairly intuitive. Details of the different modules shown in Figure 5 are summarized below.

Medium Access Control Layer: This layer provides the traditional TCP/IP MAC layer service, i.e. medium access for hop-to-hop data transfer and channel error control, and power control of the radio antenna. While the important attributes of traditional MAC are fairness, latency, throughput, and bandwidth utilization, the key aspects for sensor MAC are energy efficiency and scalability towards size and topology change. As the probability of channel error is higher in a lossy wireless environment, error control is an important service to be provided by the MAC layer, whereby the application might specify the degree of reliability it needs from the MAC. Many communication requirements of sensor applications are periodic and known beforehand, such as collecting temperature statistics at regular intervals. Further, the communication pattern within the network is typically sink-directed or local communication. Hence, these predictable communication needs and limited communication abstractions can be communicated using the IES to provide contention-free medium access. Event-driven applications like fire detection are supported using contention-based mechanisms to meet real-time constraints. A hybrid approach with both contention-based and contention-free medium access can thus serve the application needs by leveraging the information from IES.

Data Service Layer: This layer provides two main services, namely, dissemination of (potentially fused) data to one or more neighbors, and reception of data packets for fusion or delivery to the application. To support these two services, this layer needs to implement the following functionalities:

1. Packet Scatter/Gather: If the message size is greater than the MAC layer packet size, then a message may need to be fragmented at the source, and reassembled at both intermediate fusion points as well as ultimate destination points.

2. Logical naming and filtering: This functionality resolves the logical naming of data packet addresses. It is called only for the incoming data packets. The data service layer uses the information available from IES to resolve the logical name, i.e. it matches IES-provided values for the packet attributes with those available in the packet itself, and it forwards the packet, for fusion consideration, to the data fusion layer in case of a match. In case of a mismatch, the data packet is passed, for data forwarding consideration, to the next hop selection functionality.

3. Next Hop Selection: This function makes the data routing decisions. Based upon the destination's logical name, it determines the next hop's logical name and updates the data packet header.

Data Fusion Layer: This layer needs to support both types of fusion mechanisms: first, where the data packets are required to be fused at every hop, and second, where the data packets are to be considered for fusion only at the (application-specified) selected nodes or the destination before the data is delivered to the application. The first kind of fusion mechanism is meant for hop-to-hop data transfer, and is useful for application scenarios where every node is sensing some useful data that needs in-network fusion [47]. The second kind is useful for supporting a more traditional way of doing fusion at the end-points, as sometimes required when only some nodes are contributing towards the information that is being sought [36].

Application and Helper Service Layer:
The application layer is responsible for supporting data capture, data presentation at the sink, executing fusion function code at appropriate nodes, and providing other services that can be used to adapt SensorStack to application-specific demands. These other services can be executed either in user space, or in kernel space (in a sandboxed manner similar to the fusion handler) in the helper service layer. We think that localization and synchronization are two important services that need to be part of SensorStack, and we have placed them in the helper service layer. The localization and synchronization services publish location and time information to the IES.

Information Exchange Service: This service acts as an information bulletin board that serves two main purposes: making one service's information available to another service,
and notifying a service when some specified conditions, upon the dynamic data being produced by other services, get satisfied. This service can be provided as a publish/subscribe (pub/sub) API, though there are some basic differences between traditional pub/sub systems and the IES. First, a published item in pub/sub systems, once delivered to its subscribers, does not need to be kept in memory. IES data, in contrast, may not have an immediate subscriber, but the data may still be useful to a protocol module later in time, and so the data needs to be kept in memory as long as possible. Second, unlike traditional pub/sub systems, a published data item in IES is made available to its subscribers only in a best-effort manner. Thus, if the network bandwidth is already occupied with application data transmission, IES data may not be delivered to its remote subscribers. Since IES deals with control data, this best-effort delivery is quite appropriate and allows the IES design to have many optimizations that may not be possible in traditional pub/sub systems.
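These bulletin-board semantics can be sketched in a few lines (an illustrative model only; all names are ours, and the real TinyOS and Linux implementations of IES are described in Chapter 4). The two departures from classic pub/sub are directly visible: posted data is retained for later readers, and notification is best-effort:

```python
class IES:
    """Toy information exchange service: a retained-data bulletin board."""

    def __init__(self):
        self.board = {}      # key -> latest posted value (kept, not consumed)
        self.watches = []    # (key, predicate, callback) triples

    def post(self, key, value):
        """A module publishes control data; matching subscribers are
        notified best-effort (here simply synchronously, never queued)."""
        self.board[key] = value
        for k, pred, cb in self.watches:
            if k == key and pred(value):
                cb(key, value)

    def read(self, key, default=None):
        """Unlike classic pub/sub, data outlives delivery and can be
        read by modules that show up later in time."""
        return self.board.get(key, default)

    def notify_when(self, key, predicate, callback):
        """Register interest in a condition over dynamic data."""
        self.watches.append((key, predicate, callback))
        # fire immediately if retained data already satisfies the predicate
        if key in self.board and predicate(self.board[key]):
            callback(key, self.board[key])
```

For instance, a MAC module might post link quality while a routing module asks to be notified when it degrades past a threshold, without the two modules knowing about each other.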
2.4 Summary
This chapter highlights the need for rethinking the traditional TCP/IP stack for the FWSN environment. It shows that hop-to-hop requirements, along with other distinguishing characteristics of FWSN, demand a new stack layering. We present the design of a new protocol stack that is suitable for sensor applications. By making the protocol layers dynamically adaptable to the application needs, the new stack allows exploiting application-specific requirements. IES, explored in Chapter 4, can be used to easily support adaptability in different protocol modules. In the next chapter, we look closely at the fusion layer to support network-level adaptability and to understand the requirements for the IES.
CHAPTER 3
NETWORK LEVEL ADAPTABILITY
This chapter presents the DFuse system we have built to facilitate network-level adaptability. DFuse is an architectural framework for dynamic fusion of application-specified data in sensor networks. It bridges an important abstraction gap for developing advanced fusion applications that take into account the dynamic nature of applications and sensor networks. Elements of the DFuse architecture include a fusion API (explored in detail in a related work [80]), a distributed role assignment algorithm that dynamically adapts the placement of the application task graph on the network, and an abstraction migration facility that aids such dynamic role assignment. Simulation results show that the role assignment algorithm significantly increases the network lifetime over static placement.
3.1 Introduction
Developing a fusion application is challenging in general, for the fusion operation typically requires time-correlation and synchronization of data streams coming from several distributed sources. Since such applications are inherently distributed, they are typically implemented via distributed threads that perform fusion hierarchically. Thus, the application programmer has to deal with thread management, data synchronization, buffer handling, and exceptions (such as time-outs while waiting for input data for a fusion function), all with the complexity of a loosely coupled system. FWSN add another level of complexity to such application development: the need to be power-aware [8]. In-network aggregation and power-aware routing are techniques to alleviate the power scarcity of FWSN. While the good news about fusion applications is that they inherently need in-network aggregation, a naive placement of the fusion functions on the network nodes will diminish the usefulness of in-network fusion, and reduce the longevity of the network (and hence the application). Thus, managing the placement (and dynamic relocation) of the fusion functions
on the network nodes with a view to saving power becomes an additional responsibility of the application programmer. Dynamic relocation may be required either because the remaining power level at the current node drops below a threshold, or to save the power consumed in the network as a whole by reducing the total data transmission. Supporting the relocation of fusion functions at run-time has all the traditional challenges of process migration [90].

We have developed DFuse, an architecture for programming fusion applications. It supports distributed data fusion with automatic management of fusion point placement and migration to optimize a given cost function (such as network longevity). Using the DFuse framework, application programmers need only implement the fusion functions and provide the dataflow graph (the relationships of fusion functions to one another, as shown in Figure 1). The fusion API in the DFuse architecture subsumes issues such as data synchronization and buffer management that are inherent in distributed programming. The DFuse framework consists of two main components: a fusion module and a placement module. The focus of this research has been the placement module, and this chapter details only the placement module. The fusion module has been explored in depth in a related work [80]. The main contributions of the placement module are summarized below:

1. Distributed algorithm for fusion function placement and dynamic relocation: There is a combinatorially large number of options for placing the fusion functions in the network. Hence, finding an optimal placement, in a distributed manner, that minimizes communication is difficult. We develop a novel heuristic-based algorithm to find a good (with respect to some predefined cost function) mapping of fusion functions to the network nodes. Also, the placement needs to be re-evaluated quite frequently considering the dynamic nature of FWSN.
The mapping is re-evaluated periodically to address dynamic changes in nodes’ power levels and network behavior.
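As a cartoon of such periodic re-evaluation (entirely hypothetical function names and cost model; the actual distributed algorithm appears in Section 3.3), a fusion point can greedily migrate to a neighbor that lowers a hop-count transmission cost, skipping neighbors whose remaining energy is below a threshold:

```python
def transmission_cost(node, sources, sink, hops):
    """Total hop count for pulling data from all sources to `node`
    and pushing the fused result on to the sink."""
    return sum(hops[node][s] for s in sources) + hops[node][sink]

def relocate(fusion_node, neighbors, sources, sink, hops, energy, threshold):
    """One re-evaluation round: move the fusion role to the cheapest
    healthy neighbor, or stay put if no neighbor improves the cost."""
    best = fusion_node
    best_cost = transmission_cost(fusion_node, sources, sink, hops)
    for n in neighbors[fusion_node]:
        if energy[n] > threshold:       # depleted nodes are not candidates
            cost = transmission_cost(n, sources, sink, hops)
            if cost < best_cost:
                best, best_cost = n, cost
    return best
```

On a four-node line 0-1-2-3 with sources at nodes 0 and 1 and the sink at node 3, a fusion point starting at node 0 migrates to node 1, and stays at node 0 once node 1's energy falls below the threshold; repeating such rounds as energy levels change is the flavor of the periodic re-evaluation described above.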
We specifically use the term data fusion and distinguish it from in-network aggregation. The latter is typically performed on data of the same type to minimize communication, and the operation on the data itself is fairly simple, such as addition or min/max. On the other hand, data fusion is performed on data of possibly different types, and the operations may be arbitrarily complex to derive a higher-level decision. Thus, fusion point placement is a more involved problem than aggregator placement.

2. Experimental evaluation of the DFuse framework: The evaluation includes microbenchmarks of the primitives provided by the fusion API as well as measurement of the data transport in a tracker application. Using an implementation of the fusion API on a wireless iPAQ farm, coupled with an event-driven engine that simulates the FWSN, we quantify the ability of the distributed algorithm to increase the longevity of the network with a given power budget of the nodes. For example, we show that the proposed role assignment algorithm increases the network lifetime by 110% compared to static placement of the fusion functions.

The rest of the chapter is structured as follows. Section 3.2 presents the DFuse architecture. Section 3.3 explains a heuristic-based distributed algorithm for placing fusion points in the network. This is followed by implementation details of the framework in Section 3.4 and its evaluation in Section 3.5. We then compare our work with existing and other ongoing efforts in Section 3.6, and conclude in Section 3.7.
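The aggregation-versus-fusion distinction drawn above can be illustrated with a toy example (the functions and thresholds are invented for illustration only): aggregation reduces same-type readings with a simple operator, while fusion combines heterogeneous inputs through arbitrary logic into a higher-level inference:

```python
def aggregate_max(temperatures):
    """In-network aggregation: same-type data, a simple operator."""
    return max(temperatures)

def fuse_fire_alert(temperature_c, smoke_level, motion_detected):
    """Data fusion: heterogeneous inputs combined by arbitrary
    (here: made-up) decision logic into a higher-level decision."""
    score = 0
    if temperature_c > 60:
        score += 2
    if smoke_level > 0.5:
        score += 2
    if motion_detected:
        score += 1
    return score >= 3   # a decision, not just a reduced reading
```

The aggregate is cheap and type-preserving, so its placement matters little; the fusion function consumes several distinct streams, which is why placing it well in the network is the harder problem.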
3.2 Application Context and Requirements
A fusion application has the following characteristics: (1) it is continuous in nature, (2) it requires efficient transport of data from/to distributed sources/sinks, and (3) it requires efficient in-network processing of application-specified fusion functions. A data source may be a sensor (e.g. camera) or a standalone program; a data sink represents an end consumer and includes a human in the loop, an actuator (e.g. fire alarm), an application (e.g. data logger), or an output device such as a display; a fusion function transforms the data streams (including aggregation of separate streams into a composite one) en route to the sinks.
Thus, a fusion application is a directed task graph: the vertices are the fusion functions, and the edges represent the data flow (i.e. producer-consumer relationships) among the fusion points (cycles, if any, represent feedback in the task graph). This formulation of the fusion application has a nice generality. It may be an application in its own right (e.g. video-based surveillance). It allows hierarchically composing a bigger application (e.g. emergency response) wherein each component may itself be a fusion application (e.g. image processing of videos from traffic cameras). It allows query processing by overlaying a specific query (e.g. “show a composite video of all the traffic at the spaghetti junction”) on to the task graph.

Figure 1 shows an example surveillance application, in which filter and collage are fusion functions that transform input streams into output streams in an application-specified manner. The fusion functions may result in contraction or expansion of data flows in the network. For example, the filter function selects images with some interesting properties (e.g. a rapidly changing scene), and sends the compressed image data to the collage function. Thus, the filter function is an example of a fusion point that does data contraction. The collage function uncompresses the images coming from possibly different locations. It combines these images and sends the composite image to the root (sink) for further processing. Thus, the collage function represents a fusion point that may do data expansion.
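The task-graph formulation can be encoded directly as a small dataflow (a toy evaluator with hypothetical names following the filter/collage example; this is not the DFuse API, which is distributed rather than single-process):

```python
def run(graph, sources, sink="sink"):
    """Evaluate a fusion task graph by pulling from the sink: each
    vertex applies its fusion function to its input vertices' outputs."""
    cache = dict(sources)  # source streams seed the dataflow

    def eval_node(name):
        if name not in cache:
            node = graph[name]
            cache[name] = node["fn"]([eval_node(i) for i in node["inputs"]])
        return cache[name]

    return eval_node(sink)

# Filter contracts data (truncates each frame); collage merges streams.
surveillance = {
    "filter":  {"inputs": ["cam1", "cam2"],
                "fn": lambda frames: [f[:2] for f in frames]},  # contraction
    "collage": {"inputs": ["filter"],
                "fn": lambda ins: b"".join(ins[0])},            # merge
    "sink":    {"inputs": ["collage"], "fn": lambda ins: ins[0]},
}
```

Here `run(surveillance, {"cam1": b"AAAA", "cam2": b"BBBB"})` yields `b"AABB"`: the filter contracts each camera frame and the collage merges the contracted frames into one composite delivered at the sink.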
3.2.1 Architectural Assumptions
We have designed the DFuse architecture to cater to the evolving application needs and emerging technology trends. We make some basic assumptions about the execution environment in the design of DFuse:

• The application-level inputs to the architecture are:

1. an application task graph consisting of the data flows and relationships among the fusion functions;

2. the code for the fusion functions (currently supported as C program binaries);
3. a cost function that formalizes some application quality metric for the sensor network (e.g. “keep the average node energy in the network the same”). • The task graph has to be mapped over a large geographical area. In the ensuing overlay of the task graph on to the real network, some nodes may serve as relays while others may perform the application-specified fusion operations. • The fusion functions may be placed anywhere in the sensor network as long as the cost function is satisfied. • All source nodes are reachable from the sink nodes. • Every node has a routing layer that allows each node to determine the route to any other node in the network. This is in sharp contrast to most current day sensor networks that support all-to-sink style routing. However, the size of the routing table in every node is only proportional to the size of the application task graph (to facilitate any network node in the ensuing overlay to communicate with other nodes hosting fusion functions) and not the physical size of the network. • The routing layer exposes information (such as hop-count to a given destination) that is needed for making mapping decisions in the DFuse architecture. It should be noted that dynamically generating a task graph to satisfy a given datacentric query plan is in itself an interesting research problem. However, the focus of this chapter is to provide the system support to meet the application requirements elaborated in the previous subsections honoring the above assumptions. 3.2.2
3.2.2 DFuse Architecture Components
Figure 6 shows the components of the DFuse architecture. There are two components to this architecture: the fusion module and the placement module. From an application perspective, there are two main concerns: 1. How do we generate an overlay of the task graph on to the sensor network? As we mentioned earlier, some nodes in the overlay will act as relays and some will act
as fusion points. Since the application is dynamic (sources/sinks may join/leave as dictated by the application, new tasks may be created, etc.), and the physical network is dynamic (sources/sinks may fail, intermediate nodes may run out of energy, etc.), this mapping is not a one-time deal. After an initial mapping, re-evaluation of the mapping (triggered by changes in the application or the physical infrastructure) will lead to new assignment and re-assignment of the nodes and the roles they play (relay vs. fusion). We have designed a placement module that embodies a role assignment algorithm that deals with the above issues. We describe this module in Section 3.3. 2. How do we develop the fusion application? The fusion code itself (e.g., “motion detection” over a number of incoming video streams, producing a digest) is in the purview of the application developer. However, there are a number of systems issues that have to be dealt with before a fusion operation can be carried out at a given node, including: (a) providing the “plumbing” from the sources to the fusion point; (b) ensuring that all the inputs are available; (c) managing the node resources (CPU and memory) to enhance performance; and (d) error and failure handling when some sources are non-responsive. In joint work with Wolenetz et al. [36], we have designed a fusion module with a rich set of APIs that deals with all of the above issues. A summary of fusion module capabilities is listed in Section 3.2.3.

3.2.3 Fusion Module
The fusion module provides the following functionalities to help a fusion application programmer:
• Structure management to handle “plumbing” issues.
• Correlation control to handle specification and collection of “correlation sets” (related input items supplied to the fusion function).
• Computation management to handle the specification, application, and migration of fusion functions.
• Memory management to handle caching, prefetching, and buffer management.
• Failure and latency handling: This category of capabilities primarily allows the fusion points to perform partial fusion, i.e., fusion over an incomplete input correlation set.
• Status and feedback handling to allow interaction between fusion functions and data sources such as sensors that supply status information and support a command set (for example, activating a sensor or altering its mode of operation; such devices are often a combination of a sensor and an actuator).

Figure 6: DFuse Architecture. (The application task graph, fusion code, and cost function feed the fusion module and placement module, which sit above the resource monitor and routing layer interface, the operating system/routing layer, and the hardware.)

Figure 7: An example task graph using the fusion channel abstraction. (Producers, i.e., sensors or other fusion channels, feed a fusion function f() backed by a prefetch buffer; its output goes to consumers, i.e., actuators or other fusion channels.)

The fundamental abstraction in DFuse is the fusion channel, shown in Figure 7. It is a named, global entity that abstracts a set of inputs and encapsulates a programmer-supplied fusion function. Inputs to a fusion channel may come from the node that hosts the channel or from a remote node. Item fusion is automatic and is performed according to a programmer-specified policy, either on request (demand-driven, lazy, pull model) or when input data
is available (data-driven, eager, push model). Items are fused and accessed by timestamp (usually the capture time of the incoming data items). An application can request an item with a particular timestamp or by supplying some wildcard specifiers supported by the API (such as earliest item, latest item). Requests can be blocking or non-blocking. To accommodate failure and late arriving data, requests can include a minimum number of inputs required and a timeout interval. Fusion channels have a fixed capacity specified at creation time. Finally, inputs to a fusion channel can themselves be fusion channels, creating fusion networks or pipelines. Using a standard or programmer-supplied protocol, a fusion channel may be migrated on demand to another node of the network. This feature is essential for supporting the role assignment functionality of the placement module described below.
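The fusion channel semantics just described (timestamped items, on-demand fusion over a correlation set, wildcard specifiers such as earliest/latest, a minimum-input requirement for partial fusion, and a fixed capacity) can be sketched compactly. The class below is an illustrative single-node, non-blocking emulation in Python, not the DFuse API itself; the method and parameter names are our own.

```python
class FusionChannel:
    """Illustrative pull-model fusion channel: inputs are keyed by
    timestamp; fusion runs on demand over a correlation set."""

    def __init__(self, name, fuse, num_inputs, capacity=8):
        self.name = name
        self.fuse = fuse              # programmer-supplied fusion function
        self.num_inputs = num_inputs  # fan-in of the channel
        self.capacity = capacity      # fixed capacity, set at creation time
        self.inputs = {}              # timestamp -> list of input items

    def put(self, ts, item):
        self.inputs.setdefault(ts, []).append(item)
        # drop the oldest timestamps beyond the channel's fixed capacity
        while len(self.inputs) > self.capacity:
            del self.inputs[min(self.inputs)]

    def get(self, ts="latest", min_inputs=None):
        """Fuse and return the item for a timestamp; "earliest"/"latest"
        act as wildcard specifiers. Returns None (instead of blocking)
        when fewer than min_inputs items have arrived; lowering
        min_inputs corresponds to partial fusion over an incomplete
        correlation set."""
        if not self.inputs:
            return None
        if ts == "latest":
            ts = max(self.inputs)
        elif ts == "earliest":
            ts = min(self.inputs)
        items = self.inputs.get(ts, [])
        need = self.num_inputs if min_inputs is None else min_inputs
        if len(items) < need:
            return None
        return ts, self.fuse(items)
```

A channel's inputs could themselves be the outputs of other FusionChannel instances, giving the fusion networks and pipelines mentioned above.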
3.3 Placement Module
This module is responsible for creating an overlay of the application task graph on to the physical network that best satisfies an application-specified cost function. A network node can play one of three roles: end point (source or sink), relay, or fusion point [4]. In our model, the end points are determined by the application. The placement module embodies a distributed role assignment algorithm that manages the overlay network, dynamically assigning fusion points to the available nodes in the network. The role assignment algorithm has to be aware of the following properties of a FWSN: • Node Heterogeneity: Some nodes may be resource-rich compared to others. For example, a particular node may be connected to a permanent power supply. Clearly, such nodes should be given more priority for taking on transmission-intensive roles compared to others. • Communication vs. Computation: Studies have shown that wireless communication is more energy draining than computation in current-day wireless sensor networks [27]. It should be noted that in future wireless sensor networks, wherein we expect more significant processing in the nodes (exemplified by the fusion functions), computation is likely to play an increasingly important role in determining node energy [80].
• Dynamic Behavior: As we already mentioned (see Section 3.2.2), both the application and the network environment are dynamic, thus requiring role assignment decisions to be made fairly frequently during the lifetime of the application. Therefore, it is important that the algorithm have low overhead and scale well with the size of the application and the network. The formulation of the role assignment problem, the cost function, and the proposed heuristic are sensitive to the above properties. However, the experimental and simulation results (Section 3.5) deal only with homogeneous network nodes.

3.3.1 The Role Assignment Problem
Given N = (V_n, E_n), viz., the network topology, and T = (V_t, E_t), viz., the task graph, and an application-specific cost metric, the goal is to find a mapping f : V_t → V_n that minimizes the overall cost. Here, V_n represents the nodes of the sensor network and E_n represents the communication links between them. In the task graph, V_t represents fusion functions (filter, data fusion, etc.) and E_t represents the flow of data between the fusion points. A mapping f : V_t → V_n generates an overlay network of fusion points on network nodes; implicitly, this generates a mapping l : E_t → {e | e ∈ E_n} of data flows to communication links. The focus of the role assignment algorithm is to determine f; determining l is the job of the routing layer and is outside the scope of this thesis. We use fusion function and fusion point interchangeably to mean the same thing, namely, the node that runs any fusion function.

3.3.2 Cost Functions
We describe five sample cost functions below. They are motivated by recent research in power-aware routing in mobile ad hoc networks [67, 32]. The health of a node k to host a fusion role r is expressed as the cost function c(k, r). Note that the lower the value computed by the cost function, the better the node health, and therefore the better equipped the node k is to host the fusion point r.
• Minimize transmission cost - 1 (MT1): This cost function aims to decrease the amount of data transmission required for hosting a fusion point. Input data needs to be transmitted from the sources to the fusion points, and the fusion output needs to be propagated to the consumer nodes (possibly going through multiple hops). For a fusion function r with m input data sources (fan-in) and n output data consumers (fan-out), the cumulative transmission cost for placing r on node k is formulated as:

c_MT1(k, r) = Σ_{i=1}^{m} t(source_i) · HopCount(input_i, k) / Rel(input_i, k)
            + Σ_{j=1}^{n} t(r) · HopCount(k, output_j) / Rel(k, output_j)
Here, t(x) represents the amount of data transmission (in bits per unit time) of a data source x, HopCount(i, k) is the distance (in number of hops) between node i and node k, and Rel(i, j) is a value between 0 and 1 representing the reliability of the transmission link between nodes i and j. • Minimize power variance (MPV): This cost function aims to keep the power (energy) levels of the nodes the same. This is a simple cost function that ignores the actual work being done by a node and simply focuses on the remaining power at any given node for role assignment decisions. The cost of doing any work at node k is inversely proportional to its remaining power level power(k), and is formulated as:

c_MPV(k) = 1/power(k)

• Minimize ratio of transmission cost to power (MTP): MT1 focuses on the work done by a node. MPV focuses on the remaining power at a given node. MTP aims to give work to a node commensurate with its remaining energy level, and thus represents a combination of the first two cost functions. Intuitively, a role assignment based on MTP is a reflection of how long a given node k can host a given fusion function r with its remaining power level power(k). Thus, the cost of placing a fusion function r on node k is formulated as:

c_MTP(k, r) = c_MT1(k, r) · c_MPV(k) = c_MT1(k, r)/power(k)
• Minimize transmission cost - 2 (MT2): This cost function takes into account node heterogeneity. In particular, it biases the role assignment with a view to protecting low-energy nodes. It can be considered a variant of MT1, with the cost function behaving like a step function based on the node's remaining power level. The intuition is that if a node has wall power then its cost function is the same as MT1. For a node that is battery powered, we would protect it from hosting (any) fusion point if its energy level goes below a predetermined threshold. Thus a role transfer should be initiated from such a node even if it results in increased transmission cost. This is modeled by making the cost of hosting any fusion function at this node infinity. This cost function is formulated as:

c_MT2(k, r) = ( power(k) > threshold ) ? c_MT1(k, r) : INFINITY

• Minimize Computation and Communication Cost (MCC): This cost function accounts for both the computation as well as the communication cost of hosting a fusion function:

c_MCC(k, r) = c_MT1(k, r) · e_radio(k)                           (communication energy)
            + cycleCount(k, r) · frameRate(r) · e_comp(k)        (computation energy)

This equation has two parts: 1. Communication energy: c_MT1(k, r) represents the transmission cost (bits per unit time); e_radio(k) is the energy per bit consumed (Joules/bit) by the radio at node k. 2. Computation energy: cycleCount(k, r) is the computation cost (total number of instructions per input data item) for executing the fusion function r on node k for a standard data input (frame) size; frameRate(r) is the number of items generated per second by r; e_comp(k) is the energy per instruction consumed (J/instruction) by the processor at node k.
If the network is homogeneous and if we assume that the processing of a given fusion function r is data-independent, then cycleCount(k, r) and e_comp(k) are the same for any node k. In this case role assignment based on MCC behaves exactly the same as MT1. In the experimental and simulation results reported in this chapter (which make a relative comparison of our role assignment algorithm against static and optimal placements), we do not consider MCC any further. However, absolute network lifetime will diminish if the node energy for computation is taken into account. Such a study is outside the scope of this thesis and is addressed in companion works [80, 81, 82]. It should be emphasized that the above cost functions are samples. The application programmer may choose to specify the health of a node using the figure of merit that best matches the application-level requirement. The role assignment algorithm to be discussed shortly simply uses the application-provided metric in its assignment decisions.
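The five sample cost functions transcribe almost directly into code. The sketch below is an illustrative transcription under simplifying assumptions: the fusion function r is a small dict, per-link reliability defaults to 1, and the network state (rates, hop counts, power levels) is supplied as plain callables and dicts rather than queried from a routing layer. Lower values mean better health, as in the text.

```python
INFINITY = float("inf")

def mt1(k, r, t, hops, rel=lambda a, b: 1.0):
    """c_MT1: cumulative transmission cost of placing fusion function r
    (a dict with 'name', 'inputs', 'outputs') on node k. t(x) is the
    data rate of x in bits per unit time; hops(a, b) is the hop count;
    rel(a, b) is link reliability in (0, 1], defaulting to 1."""
    cost = sum(t(s) * hops(s, k) / rel(s, k) for s in r["inputs"])
    cost += sum(t(r["name"]) * hops(k, o) / rel(k, o) for o in r["outputs"])
    return cost

def mpv(k, power):
    # c_MPV: inversely proportional to the node's remaining power
    return 1.0 / power[k]

def mtp(k, r, t, hops, power):
    # c_MTP: work commensurate with remaining energy (MT1 * MPV)
    return mt1(k, r, t, hops) / power[k]

def mt2(k, r, t, hops, power, threshold):
    # c_MT2: protect battery-powered nodes below the energy threshold
    return mt1(k, r, t, hops) if power[k] > threshold else INFINITY

def mcc(k, r, t, hops, e_radio, cycle_count, frame_rate, e_comp):
    # c_MCC: communication energy plus computation energy
    return mt1(k, r, t, hops) * e_radio + cycle_count * frame_rate * e_comp
```

For a single fusion function with two 1000 bps sources one hop away and a 500 bps output one hop from the sink, mt1 yields 2500, and the other metrics scale it by the node's power state exactly as the formulas above prescribe.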
3.3.3 In Search of Optimality
The general case of mapping an arbitrary task graph (directed) onto another arbitrary processor graph (undirected) is NP-complete [20, 57]. Since DFuse treats a fusion function as a black box, and the application-specified cost function can be arbitrary, the general problem of role assignment is NP-complete. Given specific task graphs and specific cost functions, optimal solutions can be found deterministically. For example, consider an input task graph in which all the fusion functions are data-expanding. For transmission-based cost functions (MT1, MT2), a trivial O(1) algorithm would place all the fusion points at the sinks (or as close as possible to the sinks if there is a problem hosting multiple roles at a sink). At the other extreme, consider a task graph where all the fusion functions are data-contracting. In this case, for transmission-based cost functions, fusion functions need to be assigned to nodes that are as close to the data sources as possible. Therefore, finding a minimum Steiner tree (MST) of the network graph connecting the sources and the sinks, and then mapping individual fusion functions to nodes close to the data sources, may lead to
a good solution, though optimality is still not guaranteed, as shown in an example mapping in Figure 8. In the example, since an MST is obtained without considering the transmission requirements in the task graph, a mapping based on MST turns out to be more expensive than the optimal. Also, finding an MST is APX-complete¹, with a best known approximation bound of 1.55 [65]. Finally, as illustrated in this example, MST-based solutions cannot be applied for an arbitrary task graph and cost functions. Moreover, existing approximate solutions to Steiner tree and graph mapping problems are impractical to apply to sensor networks. Such solutions assume that (1) the network topology is known, (2) costs on edges in the network are known, and (3) the network is static. None of these assumptions hold in our case: the network topology is not known and must be discovered; costs on edges are known locally but not globally (and it is too expensive to gather this information at a central planner); and even if we could find the optimal deployment, because of the inherent dynamism of FWSN, we would need to re-deploy.

Figure 8: Mapping a task graph using a minimum Steiner tree (MST): the example shows that MST does not lead to an optimal mapping. For G_n, the edge weights can be thought of as hop counts, and for G_t, as transmission volume. Edge weights on the overlay graphs (c and d) are obtained by multiplying the edge weights of the task graph with the corresponding edge weights of the network links. (Panels: (a) network graph G_n; (b) MST connecting S1, S2, and the sink; (c) best mapping of G_t upon the MST, cost = 15; (d) best mapping of G_t upon G_n, cost = 13.)
¹APX is the class of optimization problems in NP having polynomial-time approximation algorithms.
All of these considerations lead us to design a heuristic for role assignment. The fundamental design principle is not to rely on any global knowledge.
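For contrast, a brute-force search for the optimal mapping f : V_t → V_n, which needs exactly the global topology knowledge we cannot assume, can be sketched as below. It is an illustrative sketch under a transmission-style cost (edge data volume times shortest-path hop count) and is exponential in the number of fusion points; the encoding of the graphs is our own.

```python
from itertools import product

def shortest_hops(adj, src):
    """BFS hop counts from src in an undirected network graph given as
    an adjacency dict."""
    dist, frontier = {src: 0}, [src]
    while frontier:
        nxt = []
        for u in frontier:
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    return dist

def optimal_mapping(adj, task_edges, fixed, fusion_points):
    """Brute-force f : V_t -> V_n. task_edges are (u, v, volume) flows in
    the task graph; fixed is the pre-determined placement of sources and
    sinks; fusion_points are the task vertices still to be placed. The
    search enumerates every placement: exponential, and it assumes the
    full topology is known, which a real FWSN deployment does not."""
    hops = {n: shortest_hops(adj, n) for n in adj}
    best_cost, best_map = float("inf"), None
    for choice in product(adj, repeat=len(fusion_points)):
        f = dict(fixed)
        f.update(zip(fusion_points, choice))
        cost = sum(vol * hops[f[u]][f[v]] for u, v, vol in task_edges)
        if cost < best_cost:
            best_cost, best_map = cost, f
    return best_cost, best_map
```

On a line network source-a-b-sink with a data-contracting fusion (input volume 4, output volume 1), the search places the fusion point at the source end, consistent with the observation above that contraction pulls fusion toward the sources.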
3.3.4 The Role Assignment Heuristic
We have developed a heuristic that adheres to the design principle of using local information and not relying on any global knowledge. The heuristic goes through three phases: initialization, optimization, and maintenance. We describe these phases in the following subsections.

3.3.4.1 Initialization Phase
In this phase, we make a first-cut naive assignment of fusion points to the network nodes. The application has not started yet, and no application-specified cost function is used in this initial mapping. The only real input to this phase of the algorithm is the resource constraints of the network nodes (e.g., “is a node connected to a wall socket?”, “does a node have enough processing power and memory?”, etc.). The initial placement, however, has a huge impact on the quality of the steady-state behavior of the application. For example, if the initial mapping places a fusion point k hops away from an optimal placement, at least k role transfers are needed before reaching an optimum. Therefore, we adopt either a top-down or bottom-up approach, based on the nature of the application task graph, to make the initial placement a good starting point for the subsequent phases. Essentially, if it is known that all the fusion functions in the task graph are data-contracting, then placing them as close to the data sources as possible lowers the transmission overhead. This is the bottom-up approach, where we start from the leaves (sources) of the task graph and work towards the root (sink). On the other hand, if the task graph has mostly data-expanding fusion functions, then placing the fusion points close to the sinks makes sense. This is the top-down approach, where the mapping starts from the root of the task graph and progresses towards the leaves. The default (in the absence of any a priori knowledge about the task graph) is to take a bottom-up approach, since data contraction is more common in sensor applications. In either case, this phase is very quick. Needless to say, the mapping of sources and
sinks to network nodes is pre-determined. Therefore, the job of the initialization phase is to determine the mapping of the fusion points to the network nodes. In principle, if the task graph has a depth k and the network has a depth n, an attempt is made in the initialization phase to place all the fusion points in at most k levels of the tree, either near the sources (bottom-up) or the sinks (top-down).

Top-down Approach. This phase starts working from the root (sink) node. (Note that there could be multiple root nodes, one for each sink in the application task graph; this phase of the heuristic works in parallel starting from each such root.) Based on its resource constraints, the root node decides whether or not to host the root fusion function. If it decides to host the root fusion function, then it delegates the mapping of the children subtrees to its neighbors (either in the same level or the next level). This role assumption and delegation of the subtrees progresses until all the fusion functions are placed on network nodes. Relay nodes (as required) are used to connect the sources to the network nodes that are hosting the fusion points of the task graph closest to the leaves. It is possible that in this initial phase a node may be asked to host multiple fusion functions, and it is also possible that a node may decide not to host any fusion point but simply pass the incoming subtree on to one of its neighbors. In choosing delegates for the subtrees, the “resource richness” of the neighbors is taken into account. This recursive tree building ends at the data source nodes, i.e., the leaves of the task graph. The completion notification of the tree building phase recursively bubbles up the tree from the sources to the root.

Bottom-up Approach. As with the top-down approach, this phase starts at the root node. However, the intent is to assign fusion points close to the sources due to the data-contracting nature of the task graph. To understand how this works, suppose the input task graph is a tree with depth k. The leaf nodes are the data sources. Their parents are the fusion points k − 1 levels distant from the root. For each fusion function at level (k − 1), the root node asks an appropriate data source (commensurate with the task graph connectivity and avoiding duplication) to select the network nodes to host the set of fusion functions at level k − 1 that are its consumers. To select a fusion node from among its neighbors, a data source would prefer a node that is closer to the root node in terms of
hop count. The selected nodes at level k − 1 report their identity to the root node. Once all the fusion functions at level (k − 1) have thus been mapped, the root node recursively maps the fusion functions at the next higher levels of the tree in a similar way. As with the top-down approach, relay nodes (as required) bridge the first-level fusion points of the task graph to the root. For the bottom-up approach, the root node plays an active role in mapping the fusion points. The alternative would be to flood the complete task graph to the leaf nodes and other participating network nodes close to the leaves. Since the data sources can be quite far from the root, such flooding can be quite expensive. Therefore, in the bottom-up approach the root node explicitly contacts a network node that is hosting a fusion point at a given level to map its parents to network nodes. This was not necessary for the top-down approach, where the root node needed to contact only its neighbor nodes for the mapping of the subtrees.
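A much-simplified, centralized rendering of the top-down pass is sketched below: starting at the sink, each fusion subtree is either hosted at the current node (if it still has capacity) or passed to the most resource-rich neighbor. The real heuristic is distributed and message-driven; this sequential sketch, with invented names and a crude capacity count standing in for "resource richness", only illustrates the recursion.

```python
def top_down_place(task, root_fn, node, adj, capacity, placement=None):
    """Recursively place the fusion subtree rooted at root_fn, starting
    from network node `node` (initially the sink). `task` maps a fusion
    function to its child fusion functions; `capacity[n]` is how many
    roles node n can still host (a node may host several, or none);
    delegation prefers the neighbor with the most remaining capacity."""
    if placement is None:
        placement = {}
    host = node
    if capacity[host] == 0:
        # pass the incoming subtree on to the most resource-rich neighbor
        host = max(adj[node], key=lambda n: capacity[n])
    capacity[host] -= 1
    placement[root_fn] = host
    for child in task.get(root_fn, []):
        # delegate each child subtree to the host itself or a neighbor
        nxt = max([host] + adj[host], key=lambda n: capacity[n])
        top_down_place(task, child, nxt, adj, capacity, placement)
    return placement
```

The recursion mirrors the prose: role assumption at the current node, then delegation of children subtrees outward, ending at the leaves of the task graph.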
3.3.4.2 Optimization Phase
Upon completion of the initialization phase, the root node starts a recursive wave of “start optimization phase” messages to the nodes of the network. This phase is intended to refine the initial mapping before the application starts. The inputs to this phase are the expected data flows between the fusion points and the application-specified cost function. During this phase, a node that is hosting a fusion point is responsible for either continuing to play that role or transferring it to one of its neighbors. The decision for role transfer is taken by the node that hosts the role, based on local information. With a certain pre-determined periodicity, a node hosting a fusion function informs its neighbors of its role and its health (a metric determined by the application-specified cost function). Upon receiving such a message, the recipient computes its own health for hosting that role. If the receiving node determines that it is in better health to play that role, an intention-to-host message is sent to the sender. If the original sender receives one or more intention messages from its neighbors, the role is transferred to the neighbor with the best health. The overall health of the overlay network improves with every such role transfer. The optimization phase is time bound and is a tunable parameter of the role assignment
heuristic. Upon the expiration of the preset time bound, the root node starts a recursive wave of “end optimization phase” messages in the network. Each node is responsible for making sure that it is in a consistent state (for example, that it is not in the middle of a role transfer to a neighbor) before propagating the wave down the network. Once this message reaches the sources, this phase is over, and the application starts with data production by the sources.
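One round of the advertise/intend/transfer exchange described above can be sketched in a few lines. The sketch is illustrative: `health(node, role)` stands for the application-specified cost function (lower is better), and the message exchange is collapsed into direct calls; the function name is our own.

```python
def role_transfer_round(host, role, neighbors, health):
    """One round of the optimization-phase exchange: the host advertises
    (role, health) to its neighbors; every neighbor in strictly better
    health replies with an intention to host; the role moves to the best
    bidder, or stays put if nobody is healthier. health(node, role) is
    the application-specified cost, lower meaning better health."""
    my_health = health(host, role)
    bidders = [n for n in neighbors if health(n, role) < my_health]
    if not bidders:
        return host                                      # keep the role
    return min(bidders, key=lambda n: health(n, role))   # best bidder wins
```

Because a transfer happens only when a neighbor's cost is strictly lower, repeated rounds can only improve the overlay's total health, which is the monotonicity argument made in Section 3.3.5, and they stop at a (possibly local) minimum.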
3.3.4.3 Maintenance Phase
The maintenance phase has the same functionality as the optimization phase. The inputs to this phase are the actual data flows observed between the fusion points and the application-specified cost function. In this phase, nodes periodically exchange role and health information and trigger role transfers if necessary. The application dynamics and/or the network dynamics lead to such role transfers. Any such heuristic that relies entirely on local information is prone to getting caught in local minima (see Section 3.3.5). To reduce such occurrences, the maintenance phase periodically increases the size of the set of neighbors a node interacts with for potential role transfer opportunities. The overhead of the maintenance phase depends on the periodicity with which the neighborhood size is increased and the broadcast radius of the neighborhood. These two factors are exposed as tunable parameters to the application.

3.3.4.4 Support for Node Failure and Recovery
Failures can occur at any of the nodes of the sensor network: sources, sinks, or fusion points. A sink failure is directly felt by the application, and it is up to the semantics of the application whether it can continue despite such a failure. A source or intermediate fusion point failure is much more subtle. Such a failure affects the execution of the fusion functions hosted by the nodes downstream from the failed node that are expecting input from it. The manifestation of the original node failure is the unsuccessful execution of the fusion functions at these downstream nodes. As mentioned earlier, the fusion module has APIs to report fusion function failure to the application. The placement and fusion modules together allow recovery of the application from failure, as detailed below.
After the first two phases of the role assignment heuristic, any node that is hosting a fusion point knows only the nodes that are consuming data from it. That is, any given node does not know the identity of the network nodes that are producing data for that node. In fact, due to the localized nature of the role transfer decisions, no single node in the network has complete knowledge of the physical deployment of the task graph, nor even the complete task graph. This poses a challenge in terms of recovering from failure. Fortunately, the root node has full knowledge of the task graph that has been deployed. We describe how this knowledge is exploited in dealing with node failure and recovery. Basically, the root node plays the role of an arbiter to help resurrect a failed fusion point. Note that any data and state of the application that is lost at the failed node has to be reconstructed by the application. The recovery procedure explained below simply re-establishes the complete task graph on the sensor network subsequent to a failure.

Figure 9: An example failure scenario showing a task graph overlaid on the network: a failed node at layer k, with m producers (and their m subtrees of sources) at layer (k − 1) and n consumers at layer (k + 1). An edge in this figure may physically comprise multiple network links. Every fusion point has only local information, viz., the identities of its immediate consumers.

Recovery procedure. Let the failed fusion point be at level k of the task graph, with m producer nodes at level (k − 1) providing input to this fusion point, and n consumer nodes at level (k + 1) awaiting output from this fusion point. The following three steps are involved in the recovery procedure: 1. Identifying the consumers: The n consumer nodes at level (k + 1) generate fusion function failure messages. These messages, along with the IDs of the consumers, are propagated through the network until they reach the root node.
2. Identifying the producers: Since there are m producers for the failed node, there are correspondingly m subtrees under the failed fusion function. The root identifies these subtrees and the data sources at the leaves of these subtrees by parsing the application task graph. For each of these m subtrees, the root node selects one data source at the leaf level. The m selected data sources each generate a probe message (with information about the failed fusion function). These messages are propagated through the network until they reach the m nodes at level (k − 1). These m nodes (which are the producers of data for the failed fusion function) report their respective identities to the root node. 3. Replacing the failed fusion point: At this point the root node knows the physical identities of the consumers and producers of the failed fusion point. It requests one of them to choose a candidate neighbor node for hosting the failed fusion function. The chosen node informs the producers and consumers of the role it has assumed and starts the fusion function. This completes the recovery procedure. Needless to say, during this recovery process the consumer and producer nodes (once identified) do not attempt any role transfers. Also, this recovery procedure is not resilient to failures of the producers or the consumers (during recovery).
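The three recovery steps lend themselves to a compact sketch from the root's point of view. Message propagation is elided: we assume the root already holds the task graph, the consumer failure reports and the producer probe replies arrive as plain lists, and `pick_host` stands in for asking one of them to choose a candidate neighbor. All function and field names below are invented for illustration.

```python
def recover_failed_fusion(task_graph, failed, consumers, producers, pick_host):
    """Re-establish the task graph after `failed` (a fusion point at
    level k) dies. `consumers` are the reporting level-(k+1) nodes
    (step 1); `producers` are the level-(k-1) nodes identified via probe
    messages sent down the m subtrees (step 2); pick_host(candidates)
    chooses a replacement node to assume the role (step 3)."""
    # Step 2 sanity check: the task graph tells the root how many
    # producer subtrees, and hence probe replies, to expect.
    m = sum(1 for outs in task_graph.values() if failed in outs)
    assert len(producers) == m, "probe replies do not match the task graph"
    # Step 3: choose a replacement and rewire producers and consumers.
    new_host = pick_host(producers + consumers)
    return {"host": new_host,
            "producer_links": {p: new_host for p in producers},
            "consumer_links": {new_host: list(consumers)}}
```

The returned wiring is only the re-established plumbing; as the text notes, any application data and state lost at the failed node still has to be reconstructed by the application itself.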
3.3.5 Analysis of the Role Assignment Heuristic
For the class of applications and environments that the role assignment algorithm targets, the health of the overall mapping can be thought of as the sum of the health of the individual nodes hosting the roles. The heuristic triggers a role transfer only if there is a relative health improvement. Thus, it is safe to say that such dynamic adaptations indeed improve the life of the network with respect to the cost function. The heuristic could occasionally result in the role assignment getting caught in a local minimum. However, due to the dynamic nature of FWSN and the re-evaluation of the health of the nodes at regular intervals, such occurrences are short-lived. For example, if “minimize transmission cost” (MT1 or MT2) is chosen as the cost function, and the network is caught in a local minimum, that would imply that some node is losing energy faster than an optimal
node. Thus, one or more of the suboptimal nodes die, causing the algorithm to adapt the assignment. This behavior is observed in real life as well, and we show it in the evaluation section. The choice of the cost function has a direct effect on the behavior of the heuristic. We examine the behavior of the heuristic for a cost function that uses two simple metrics: (a) simple hop-count distance, and (b) fusion data expansion or contraction information.

Figure 10: Linear Optimization. (A) Sources feed the fusion point through a relay; cost = 5500 bps. (B) The relay becomes the new fusion point; cost = 5000 bps.

Figure 11: Triangular Optimization. (A) Inputs reach the fusion point over multiple paths; cost = 4500 bps. (B) After triangular optimization; cost = 3500 bps.

The heuristic facilitates two types of optimizations: • Linear Optimization: If all the inputs to a fusion node are coming via a relay node (Figure 10A), and there is data contraction at the fusion point, then the relay node becomes the new fusion node, and the old fusion node transfers its responsibility to the new one (Figure 10B). In this case, the fusion point moves away from the sink and comes closer to the data sources. Similarly, if the output of the fusion node is going to a relay node, and there is data expansion, once again the relay node acts as the new fusion node. In this case, the fusion point comes closer to the sink and moves away from the data sources.
• Triangular Optimization: If there are multiple paths for inputs to reach a fusion point (Figure 11A), and if there is data contraction at the fusion node, then a triangular optimization takes place (Figure 11B) to bring the fusion point closer to the data sources; in the event of data expansion the fusion point moves towards the sinks. The original fusion point node becomes a relay node.
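The linear-optimization rule reduces to a comparison of data rates: with the fusion point at the far end of a relay, the relay's links carry the uncompressed input rate; moving the fusion onto the relay means the downstream link carries the smaller fused rate instead. The sketch below illustrates the decision with invented names; the per-link breakdown matching Figure 10's 5500 and 5000 bps totals is our own reconstruction, not taken from the thesis.

```python
def should_shift_to_relay(in_rate, out_rate):
    """Linear optimization: if inputs arrive via a relay and the fusion
    contracts data (out_rate < in_rate), the relay should become the new
    fusion point, pulling fusion toward the sources. Symmetrically, data
    expansion (out_rate > in_rate) pulls fusion toward the sink."""
    return out_rate < in_rate

def transmission_cost(rates_per_link):
    # total bits per unit time crossing all links of the segment
    return sum(rates_per_link)

# A link breakdown consistent with Figure 10's totals (our reconstruction):
# two 1000 bps sources traverse source->relay and relay->fusion links, and
# the fused 1500 bps stream goes fusion->sink.
before = transmission_cost([1000, 1000, 1000, 1000, 1500])  # fusion past relay
# After the shift, the relay fuses: sources->relay plus two fused-rate hops.
after = transmission_cost([1000, 1000, 1500, 1500])
```

With an aggregate input of 2000 bps and a fused output of 1500 bps the rule fires, and the segment cost drops from 5500 to 5000 bps, matching the figure.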
3.4 Implementation
Figure 12: Fusion Module Components. (The fusion API sits above the work-thread and prefetch-thread modules and the buffers and registers, all layered over the Stampede runtime system.)

We use an iPAQ running Linux as our conceptualization of a “future sensor node” in the implementation. DFuse is implemented as a multi-threaded runtime system, assuming infrastructure support for reliable timestamped data transport through the sensor network. The infrastructural requirements are met by a programming system called Stampede [62, 2]. A Stampede program consists of a dynamic collection of threads communicating timestamped data items through channels. Stampede also provides registers with full/empty synchronization semantics for inter-thread signaling and event notification. The threads, channels, and registers can be launched anywhere in the distributed system, and the runtime system takes care of automatically garbage-collecting the space associated with obsolete items from the channels.
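Stampede's full/empty register semantics give simple inter-thread signaling: a reader blocks until a writer fills the register, and taking the value empties it again. The class below is a minimal Python emulation of that semantics using a condition variable; it illustrates the idea only and is not Stampede's actual C API.

```python
import threading

class FullEmptyRegister:
    """Emulation of full/empty synchronization semantics: put() blocks
    while the register is full, take() blocks while it is empty."""

    def __init__(self):
        self._cv = threading.Condition()
        self._full = False
        self._value = None

    def put(self, value):
        with self._cv:
            while self._full:          # wait until a reader empties it
                self._cv.wait()
            self._value, self._full = value, True
            self._cv.notify_all()      # wake blocked readers

    def take(self):
        with self._cv:
            while not self._full:      # wait until a writer fills it
                self._cv.wait()
            self._full = False
            self._cv.notify_all()      # wake blocked writers
            return self._value
```

A consumer thread calling take() before any put() simply parks on the condition variable, which is exactly the event-notification pattern the fusion module uses registers for.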
3.4.1 Fusion Module
The fusion module consists of the shaded components shown in Figure 12. It is implemented in C as a layer on top of the Stampede runtime system. All the buffers (input buffers, fusion buffer, and prefetch buffer) are implemented as Stampede channels. Since Stampede channels hold timestamped items, it is a straightforward mapping of the fusion attribute to the timestamp associated with a channel item. The Status and Command registers of the fusion architecture are implemented using the Stampede register abstraction. In addition to these Stampede channels and registers that have a direct relationship to the elements of the fusion architecture, the implementation uses additional Stampede channels and threads. For instance, there are prefetch threads that gather items from the input buffers, fuse them, and place them in the prefetch buffer for potential future requests. This feature allows latency hiding but comes at the cost of potentially wasted network bandwidth, and hence energy (if the fused item is never used). Although this feature can be turned off, we leave it on in our evaluation and ensure that no such wasteful communication occurs. Similarly, there is a Stampede channel that stores requests that are currently being processed by the fusion architecture to eliminate duplication of work. The createFC call from an application thread results in the creation of all the above Stampede abstractions in the address space where the creating thread resides. An application can create any number of fusion channels (modulo system limits) in any of the nodes of the distributed system. An attachFC call from an application thread results in the application thread being connected to the specified fusion channel for getting fused data items. For efficient implementation of the getFCItem call, a pool of worker threads is created in each node of the distributed system at application startup. 
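A worker satisfying a getFCItem request waits on each input buffer under one of the policies discussed in this section: block indefinitely, or try for a bounded time and then give up (after which the caller can fall back to partial fusion). The tick-based polling below stands in for real synchronization, and all names are illustrative, not the DFuse implementation:

```c
#include <stdbool.h>

/* Sketch of a worker's wait on one input buffer.  `input_ready` stands
 * in for the real buffer check. */
typedef enum { POLICY_BLOCK, POLICY_TRY_FOR } policy_t;

/* Returns 0 when the item arrives, -1 if the try-for window expires. */
int wait_for_item(bool (*input_ready)(int tick), policy_t p, int delta) {
    for (int tick = 0; ; tick++) {
        if (input_ready(tick)) return 0;
        if (p == POLICY_TRY_FOR && tick >= delta) return -1;
        if (p == POLICY_BLOCK && tick > 1000000) return -1; /* safety bound */
    }
}
```

One such wait runs per input buffer, in parallel across the worker threads; the worker that fetches the last requisite input invokes the fusion function.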
These worker threads are used to satisfy getFCItem requests for fusion channels created at this node. Since data may have to be fetched from a number of input buffers to satisfy the getFCItem request, one worker thread is assigned to each input buffer to increase the parallelism for fetching the data items. Once fetching is complete, the worker thread rejoins the pool of free threads. The worker thread to fetch the last of the requisite input items invokes the fusion function
and puts the resulting fused item in the fusion buffer. This implementation is performance-conscious in two ways: first, there is no duplication of fusion work for the same fused item from multiple requesters; second, fusion work itself is parallelized at each node through the worker threads. The duration to wait on an input buffer for a data item to be available is specified via a policy flag to getFCItem. For example, if the try-for-time-delta policy is specified, then the worker thread will wait for time delta on the input buffer. On the other hand, if the block policy is specified, the worker thread will wait on the input buffer until the data item is available. The implementation also supports partial fusion in case some of the worker threads return with an error code during the fetch of an item. Handling failures through partial fusion is a crucial component of the module, since failures and delays can be common in FWSN. As we mentioned earlier, Stampede does automatic reclamation of the storage space of data items in channels. Stampede garbage collection uses a global lower bound on the timestamp values of interest to any of the application threads (which is derived from a per-thread state variable called thread virtual time). Our fusion architecture implementation leverages this feature for cleaning up the storage space in its internal data structures (which are built using Stampede abstractions).

3.4.2 Placement Module
Stampede’s runtime system sits on top of a reliable UDP layer in Linux. Therefore, there is no support for adaptive multi-hop ad hoc routing in the current implementation. Further, there is no support for gathering real health information of the nodes. For the purposes of evaluation, we have adopted a novel combined implementation/simulation approach. The fusion module is a real implementation on a farm of iPAQs as detailed in the previous subsection. The placement module is an event-driven simulation of the three phases of the role assignment algorithm described in Section 3.3. It takes an application task graph and the network topology information as inputs, and generates an overlay network, wherein each node in the overlay is assigned a unique role of performing a fusion operation. It models the health of the sensor network nodes. It currently assumes an ideal routing layer
(every node knows a route to every other node) and an ideal MAC layer (no contention). Further, path reliability for the MT1 and MT2 cost function evaluation is assumed to be 1 in the simulation. However, it should be clear that these assumptions are not inherent in the DFuse architecture; they are made only to simplify the simulator implementation and evaluation. We have implemented an interface between the fusion module implementation and the placement module simulation. This interface facilitates (a) collecting the actual data rates of the sensor nodes experienced by the application running on the implementation and reporting them to the placement module simulation, and (b) communicating to the fusion module (and effecting through the DFuse APIs) dynamic task graph instantiation, role changes based on the health of the nodes, and fusion channel migration.
3.5 Evaluation
We have performed an evaluation of the placement module of the DFuse architecture to quantify its ability to optimize the network given a cost function. The experimental setup uses a set of wireless iPAQ 3870s running the Linux “familiar” distribution, version 0.6.1, together with a prototype implementation of the fusion module discussed in Section 3.4.1.

3.5.1 Application Level Measurements of DFuse
Figure 13: iPAQ Farm Experiment Setup. An arrow represents that two iPAQs are mutually reachable in one hop.

We have implemented the fusion application (Figure 1) using the fusion API and deployed it on the iPAQ farm. Figure 13 shows the topological view of the iPAQ farm used for the fusion application deployment. It consists of twelve iPAQ 3870s configured identically
to those in the measurements above. Node 0 acts as the sink node, where node i is the iPAQ corresponding to the ith node of the grid. Nodes 9, 10, and 11 are the iPAQs acting as the data sources. The locations of the filter and collage fusion points are guided by the placement module. We have tuned the fusion application to generate data at consistent rates as shown in Figure 1, with x equal to 6 KBytes per minute. This is equivalent to a scenario where cameras scan the environment once every minute and produce images ranging in size from 6 to 12 KBytes after compression. The placement module simulator runs on a separate desktop in synchrony with the fusion module. At regular intervals, it collects the transmission details (number of bytes exchanged between different nodes) from the farm. It uses a simple power model (see Section 3.5.1.1) to account for the communication cost and to monitor the power level of different nodes. If the placement module decides to transfer a fusion point to another node, it invokes the moveFC API to effect the role transfer.
Figure 14: Comparison of different cost functions. Application runtime is normalized to the best case (MT2), and total remaining power is presented as the percentage of the initial power.
3.5.1.1 Power Model
The network is organized as the grid shown in Figure 13. For any two nodes, the routing module returns the path with the minimum number of hops across powered nodes. To account for power usage at different nodes, the placement module uses a simple approach. It models the power level at every node, adjusting it based on the amount of data a node
transmits or receives. The power consumption corresponds to the ORiNOCO 802.11b PC card specification [54]. Our current power model includes only network communication costs. After finding an initial mapping (naive tree), the placement algorithm runs in the optimization phase for two seconds. The length of this period is tunable, and it influences the quality of the mapping at the end of the optimization phase. During this phase, fusion nodes wake up every 100 ms to determine if a role transfer is warranted by the cost function. After optimization, the algorithm runs in the maintenance phase until the network becomes partitioned (connectivity can no longer be supported among all the fusion points of the task graph). During the maintenance phase, role transfer decisions are evaluated every 50 seconds. Role transfers are invoked only when the health improves by a threshold of 5%.
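This accounting amounts to debiting each node's remaining capacity in proportion to the bytes it sends and receives. The per-byte costs below are placeholders; the thesis derives the actual values from the ORiNOCO card specification [54]:

```c
/* Sketch of the simple power model: capacity is debited per byte sent
 * and received.  The cost constants are placeholders, not the card's
 * actual figures. */
typedef struct { double capacity_mj; } node_t;

#define TX_COST_UJ_PER_BYTE 2.0   /* placeholder value */
#define RX_COST_UJ_PER_BYTE 1.0   /* placeholder value */

void account_transfer(node_t *src, node_t *dst, long bytes) {
    src->capacity_mj -= bytes * TX_COST_UJ_PER_BYTE / 1000.0;
    dst->capacity_mj -= bytes * RX_COST_UJ_PER_BYTE / 1000.0;
}

/* A node "dies" when remaining capacity drops below 5% of initial,
 * matching the threshold used in the evaluation. */
int is_dead(const node_t *n, double initial_mj) {
    return n->capacity_mj < 0.05 * initial_mj;
}
```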
3.5.1.2 Experimental Results

Figure 15: The network traffic timeline for different cost functions: (A) MT1: Minimize Transmission Cost - 1; (B) MPV: Minimize Power Variance; (C) MTP: Ratio of Transmission Cost to Available Power; (D) MT2: Minimize Transmission Cost - 2. The X axis shows the application runtime and the Y axis shows the total amount of data transmission per unit time, for both the actual placement and the best placement.
Figure 15 shows the network traffic per unit time (the sum of the transmission rates of all network nodes) for the cost functions discussed in Section 3.3.2. It compares the network traffic for the actual placement with that of the best possible placement of the fusion points (the best possible placement is found by comparing the transmission cost for all possible placements). Note that the application runtime can be increased simply by increasing the initial power level of the network nodes. In MT1, the algorithm finds a locally best placement by the end of the optimization phase itself. The optimized placement is only 10% worse than the best placement. The same placement continues to run the application until one of the fusion points (the one with the highest transmission rate) dies, i.e., its remaining capacity becomes less than 5% of the initial capacity. If we do not allow role migration, the application stops at this time. But allowing role migration, as in MT2, enables the migrating fusion point to keep utilizing the power of the available network nodes in the locally best possible way. Results show that MT2 provides the maximum application runtime, a 110% increase compared to MT1. The observed network traffic is at most 12% worse than the best possible for the first half of the run, and it matches the best possible rate for the latter half. MPV performs the worst, while MTP has a network lifetime comparable to MT2. Figure 15 also shows that running the optimization phase before instantiating the application improves the total transmission rate by 34% compared to the initial naive placement. Though MPV does not provide comparably good network lifetime (Figure 15B), it provides the best (least) power variance compared to the other cost functions (Figure 14A). Since MT1 and MT2 drain the power of fusion nodes completely before role migration, they show the worst power variance.
They also incur the fewest role migrations among the cost functions (Figure 14B). These results show that the choice of the cost function depends on the application context and the network condition. If, for an application, role transfer is complex and expensive but network power variance is not an issue, then MT2 is preferred. However, if network power variance needs to be minimized and role transfer is inexpensive, MTP is preferred.
Figure 16: Effect of data contraction on initial mapping quality. Network grid size is 32X32.
Figure 17: Effect of input task graph size upon total transmission cost of initial mapping. Network grid size is 32X32.

3.5.2 Simulation-based Study of Large Networks and Applications
To evaluate DFuse's scalability, we employ MSSN, an event-driven simulator of the DFuse fusion channel and placement module. We report on three studies performed with this simulator: (a) the control overhead of the initialization phase algorithms, (b) the performance of the role assignment algorithm for a small application on a large network, and (c) the scalability of the role assignment heuristic for a large application.
Figure 18: Effect of input task graph size upon control overhead of initialization algorithms. Network grid size is 32X32.
Figure 19: Effect of network size upon control overhead of the initialization algorithms. Average distance of the sink from the sources is kept in proportion to the grid width.
3.5.2.1 Initialization phase algorithms
We have simulated the two initialization algorithms to study their control overhead properties and initialization quality. The control overhead relates directly to the time and energy consumed during the initialization phase. The variables are the network graph size, the input task graph size, and the average distance from the sink to the sources. Results show that the control overhead, in terms of the total number of control messages, depends mainly on the distance between the sink and the data sources, and it varies only a little with the size of the task graph. Also, Figures 16 and 17 show that, compared to the top-down algorithm, the bottom-up approach leads to a much better initial mapping in terms of total transmission cost. Even when the task graph has a considerable number of data expansion fusion points (i.e., a contraction ratio less than 50% in Figure 16), the bottom-up algorithm gives a better initial mapping than the top-down algorithm. Figures 18 and 19 show that the control overhead (number of control messages) for the two initialization algorithms increases linearly with the task graph size and with the sink-to-data-source distance. Of the two, the bottom-up algorithm has the lower message overhead.
3.6 Related Work
Data fusion, or in-network aggregation, is a well-known technique in sensor networks. Research experiments have shown that it saves a considerable amount of power even for simple fusion functions like finding the min, max, or average reading of sensors [47, 30]. While these experiments and others have motivated the need for a good role assignment approach, they do not use a dynamic heuristic for role assignment, and their static role assignment approach is not amenable to streaming media applications. Using a dataflow (task) graph to model distributed applications is a common technique in streaming databases, computer vision, and robotics [13, 74, 64]. DFuse employs a script-based interface to express task graph applications over the network, similar to SensorWare [6]. SensorWare is a framework for programming sensor networks, but its features are orthogonal to what DFuse provides. Specifically, 1) SensorWare does not employ any strategy for assigning roles to minimize the transmission cost, or for dynamically adapting the
role assignment based on available resources. It leaves the onus on the applications. 2) Since DFuse focuses on providing support for fusion in the network, the interface for writing fusion-based applications using DFuse is quite simple compared to writing such applications in SensorWare. 3) DFuse provides optimizations like prefetching and support for buffer management that are not yet supported by other frameworks. Other approaches, like TAG [47], look at a sensor network as a distributed database and provide a query-based interface for programming the network. TAG uses an SQL-like query language and provides in-network aggregation support for simple classes of fusion functions. TAG assumes a static mapping of roles to the network, i.e., a routing tree is built based on the network topology and the query at hand. Database researchers have recently examined techniques for optimizing continuous queries over streaming data. In many cases, the systems are centralized, and all of the processing occurs on one machine [12, 51, 10]. The need to schedule operators over distributed, heterogeneous, resource-constrained devices presents fundamentally different challenges. More recently, researchers have examined distributed stream processing, where computations are pushed into the network much as in DFuse [3, 60]. Unlike DFuse, these systems usually assume powerful in-network computation nodes, and they attempt to minimize individual metrics like network usage or to maximize throughput. For example, SBON [60] expects every node to run the Vivaldi algorithm and generate a latency cost estimate of the whole network to facilitate operator placement. DFuse allows a variety of optimization functions so that the processing can be carried out on lower-capability, energy-constrained devices. Moreover, a common technique in data stream systems is to reorder filter and join (e.g., fusion) operations, according to the rules of relational algebra, in order to optimize network placement [68].
Similarly, systems such as Aurora* [13] allow “box sliding” (e.g., moving operators) and “box splitting” (e.g., splitting computation between two copies of an operator), but assume that the semantics of the operator are known in order to determine which reconfigurations are allowable. Our system is meant to deal with task graphs where the computations are complex black boxes, and where operations often cannot be reordered. Some work has looked at in-network processing of streams using resource constrained
devices [47], but that work has focused primarily on data aggregation. Our work also addresses data fusion, which utilizes more complex computations on specific streams and requires novel role assignment and migration techniques. Java object migration in MagnetOS [42] is similar in spirit to the role migration in DFuse, but MagnetOS does not allow an application to specify its own cost function to influence the migration behavior. Other areas where distributed tree optimization has been studied include distributed asynchronous multicast [5] and hierarchical cache tree optimization on the Internet [70]. However, these optimizations are done only with respect to the physical network properties, and they do not need to consider any input task graph characteristics (fusion function dependencies, black-box treatment, etc.) or sensor network requirements (dynamism, energy). One recent work on composing streaming applications over peer-to-peer environments [23] has looked at an optimization problem of our flavor, but the proposed algorithm adapts a mapping only in the event of node failures; thus it is not suitable for FWSN environments that need much more dynamic adaptation. Recent research in power-aware routing for mobile ad hoc networks [67, 32] proposes power-aware metrics for determining routes in wireless ad hoc networks. We use similar metrics to formulate different cost functions for our DFuse placement module. While designing a power-aware routing protocol is not the focus of this thesis, we are looking into how the routing protocol information can be used to define more flexible cost functions.
3.7 Summary
This chapter explores the issues involved in supporting network-level adaptability in SensorStack. In the fusion layer, we present a framework for mapping fusion applications, such as distributed surveillance, onto wireless ad hoc sensor networks. The proposed framework eases the development of complex fusion applications for future sensor networks. Our framework uses a novel distributed role assignment algorithm that increases application runtime by performing power-aware, dynamic role assignment. We validate our framework by designing a sample application and evaluating it on an iPAQ-based sensor network testbed. By using different cost functions, we show that the proposed role assignment algorithm can
adapt to network changes in an application-specific manner. To evaluate the cost function periodically, the fusion layer currently needs to use information available at other layers and at neighboring nodes. Without any explicit support for cross-layering, we need to understand other modules' interfaces (if available) to use their exposed information, and we also need to implement data collection logic to gather data from neighboring nodes. This leads to an increase in the data dependency among different protocol modules, and also an increase in the communication overhead incurred in gathering information from the neighboring nodes. In the next chapter, we investigate how a simple service for information exchange across different modules can reduce inter-module data dependencies and also reduce the communication overhead.
CHAPTER 4
NODE LEVEL ADAPTABILITY
Wireless Sensor Networks are deployed in demanding environments, where application requirements as well as network conditions may change dynamically. Thus the protocol stack in each node of the sensor network has to be able to adapt to these changing conditions. Historically, protocol stacks have been designed with strict layering and strong interfaces between the layers, leading to a robust design. However, cross-layer information sharing could help the protocol modules make informed decisions and adapt to changing environmental conditions. There have been ad hoc approaches to facilitating cross-layer cooperation for adaptability. However, there has been no concerted effort at providing a uniform framework for cross-layer adaptability that preserves the modularity of a conventional protocol stack. This chapter presents a novel service, the information exchange service (IES), as a framework for cross-module information exchange. IES is a centrally controlled bulletin board where different modules can post available data or request useful information, and get notified when the information becomes available. IES is integrated into the proposed SensorStack architecture, which preserves the benefits of layering while facilitating adaptability. IES has been implemented in TinyOS and Linux, both to show the feasibility of the design and to demonstrate the utility of cross-layering in increasing application longevity.
4.1 Introduction
The explosive growth of the Internet has been spurred to a great extent by the modularity of the network protocol stack influenced by the OSI model. Adherence to the strict interfaces between the different layers has enabled the independent development of robust protocols and their validation. While the focus on modularity (in the OSI model) has been a useful design guideline for Internet protocols, it is becoming clear that the decisions taken at runtime in
the different layers could be better optimized with cross-layer information. This is particularly true in dynamic settings, where network conditions can change quite dramatically. For example, researchers have shown the utility of explicit congestion notification from the routers to the transport layer [63], and of link status information to the IP layer in a wireless setting [87]. While modularity is key to protocol development and deployment, adaptability is emerging as a key determinant of performance, especially in a wireless setting. The design decisions in the protocol stack have to adapt to changing network conditions to maintain high performance. Such adaptability would be facilitated by the use of information available in different layers. Wireless Sensor Networks (WSN) amplify the need for sharing cross-layer information even further. In addition to the vagaries of the wireless network itself, the inherently resource-constrained nature of the nodes poses additional challenges for the protocol stack. Nodes may join or leave the network to save their individual battery power, or environmental conditions may vary, resulting in dynamic changes to the network topology. To allow for adaptability in the face of such dynamism, many WSN protocols have proposed piecemeal use of cross-layer information. For example, information from the link layer may be used by the routing layer, and routing table information may be used by the application layer. However, it is difficult to foresee all the adaptation needs. Hence it is a challenge to standardize protocol interfaces that expose all useful cross-layer information. Optimizing energy, the single most important resource for WSN nodes, requires a holistic view of the stack instead of the layer-specific view available with such piecemeal solutions. It is interesting to note that in spite of the increasing importance of cross-layering, it is still viewed with skepticism by the systems community [35].
There are good reasons for this skepticism. Without careful system support, cross-layering may result in minimal benefits, may be misused, and may lead to unintended problems in the long run. There are three main reasons that point to the need for a careful design of cross-layering. First, without standard interfaces for information sharing, cross-layering could lead to inefficiencies. Often different modules may collect the same information to adapt their behavior, leading to wastage of computation, memory, and energy resources. For example, neighborhood
information is useful both for network-level routing and for application-level role assignment; hence uncoordinated information gathering will result in significant resource wastage (see Table 2). Second, piecemeal evolution of cross-layering would lead to a spaghetti design of the protocol stack that is hard to maintain and verify, due to the complex interactions among the different modules. Third, without a holistic approach to information sharing and event notification, different protocol modules may make sub-optimal decisions, leading to poor adaptability. For example, unless the application layer is notified of a sudden change in link quality by the network layer, its role assignment decisions will be sub-optimal, thus affecting application longevity. The question being addressed in this chapter is the following: how can we facilitate holistic adaptability without losing modularity? The main issue boils down to overcoming the inherent tension between adaptability and modularity: adaptability needs cross-layer information that seems difficult to obtain without affecting modularity. In other words, how can we structure cross-layer information sharing so that it does not compromise the robustness and maintainability of the protocol stack? This problem can be solved by decoupling the adaptability needs (which are cross-layer data oriented) from the modularity needs (which are functionality oriented). We use this intuition of decoupling cross-layer data from functionality to achieve an adaptable and modular protocol stack called SensorStack. At the heart of this stack is a novel Information Exchange Service (IES) that is available to all the layers. Through a publish/subscribe interface, IES provides a predicate-based event notification service that can be used by the protocol modules for information sharing and for making adaptive decisions.
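A minimal, single-node sketch of this bulletin-board pattern follows. Modules publish named data items, and a subscriber registers a predicate plus a callback that fires when a matching value is posted. The names, signatures, and fixed-size tables are illustrative only, not the actual IES interface:

```c
#include <string.h>

/* Illustrative mock of a predicate-based publish/subscribe board. */
#define MAX_SUBS 8

typedef int  (*pred_fn)(int value);   /* condition on posted data */
typedef void (*notify_fn)(int value);

typedef struct {
    const char *key;
    pred_fn     pred;
    notify_fn   notify;
} sub_t;

static sub_t subs[MAX_SUBS];
static int   nsubs;

void ies_subscribe(const char *key, pred_fn pred, notify_fn notify) {
    subs[nsubs++] = (sub_t){ key, pred, notify };
}

void ies_publish(const char *key, int value) {
    for (int i = 0; i < nsubs; i++)
        if (strcmp(subs[i].key, key) == 0 && subs[i].pred(value))
            subs[i].notify(value);
}
```

For instance, the application layer could subscribe to a "link_quality" item with a predicate that fires only when quality drops below a threshold, so that its role assignment logic is notified without polling the network layer.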
By absorbing the onus of managing the cross-layer data for adaptability, IES allows the protocol modules to focus on their functionalities, preserving modularity. We have implemented IES in TinyOS [27], and assembled a representative SensorStack using a heterogeneous sensor network (HSN) routing layer from shareware [31] and an application-level data fusion layer called DFuse [36]. Through the implementation and evaluation we demonstrate the utility of SensorStack with IES both qualitatively and quantitatively. First, there is a qualitative benefit in that the component diagram of SensorStack with IES is simpler, with less interaction among the protocol modules for accessing cross-layer
data. From a software engineering perspective, this design lends itself to maintainability and robustness of the protocol stack. Second, we show through micro-measurements that the code-path overhead of using IES to access cross-layer information is minimal. Third, we show that resource wastage (network, memory, and CPU) is minimized by aggregating the collection of neighborhood information that is shared by all the layers via IES. This chapter highlights several contributions:

1. By decoupling cross-layer information gathering and sharing from layer functionality, we facilitate adaptability without sacrificing modularity. The design and evaluation of IES is the primary contribution. There are two main nuggets in the design of IES:

• The data management module provides a declarative publish/subscribe interface for protocols to share information, facilitating a modular design. Further, it takes care of efficient use of the available limited node memory for information representation, eviction, and access.

• The event management module provides a condition-based event notification mechanism to alert protocol modules of any changes in the environment, thus facilitating adaptability.

2. Representative implementations of SensorStack with IES on TinyOS and Linux, showing the feasibility of the IES design to promote modularity and adaptability.

3. A simple taxonomy for cross-layer information sharing that provides transparency without affecting modularity.

The rest of the chapter is organized as follows. Section 4.2 motivates the need for cross-layer system support with an example application. Section 4.3 proposes a taxonomy for sharable information in the SensorStack. The IES design is presented in Section 4.4. The implementation and evaluation of IES are presented in Sections 4.5 and 4.6, respectively. Related work is discussed in Section 4.7. Section 4.8 concludes the chapter with a summary and future work.
4.2 Motivation
In this section, we present an application scenario to motivate the need for cross-layering. Figure 1 shows the task graph for this example application, in which filter and collage are fusion functions that transform input streams into output streams in an application-specified manner.
Application
GPSR
Routing Radio Control
ASCENT
Medium Access Control
MAC
Figure 20: An example protocol stack configured to run the fusion application. Consider deploying the fusion application shown in Figure 1 on a FWSN. In the resulting overlay, the sources and sinks fixed; two intermediate nodes perform the two fusion functions, respectively, while other nodes simply serve as “relays”. Let us compose a stack that runs on each node of the WSN (Figure 20). The application layer runs the fusion code (if this node is not simply a relay node). The other layers of the stack perform routing, radio control (for energy conservation), and media access, respectively. While any protocol may be chosen for each of these layers, we will make the discussion concrete by choosing: greedy perimeter stateless routing (GPSR) [34] for data routing; ASCENT [9] for controlling the radio duty cycle; and DFuse [36] as the application. Table 2: Impact of sharing Transmission cost Without sharing 1920 bps With sharing 640 bps
neighborhood information Code footprint Memory access latency ˜ 22 K Bytes 530.1 mSec 7.5 K Bytes 42.2 mSec
With this simple stack, we can examine the opportunities that exist for cross-layer information sharing:
• Information collection and storage: Different modules independently collect and maintain the same state information. For example, GPSR maintains a neighbor nodelist for choosing an appropriate next hop; ASCENT maintains the neighbor nodelist to decide on state transitions for the radio (active, passive, or sleep); and DFuse maintains neighborhood information for its role assignment decisions. Table 2 summarizes the impact on resource usage of sharing neighborhood information among the modules. For a network with an average neighborhood size of 10 nodes, each requiring 8 bytes of state information, every periodic node update consumes a network bandwidth of 640 bps. For the example stack (with three modules gathering this information periodically), the total communication overhead is 1920 bps for every period of data collection. Clearly, if the data collection is done once and shared with the other modules, the periodic network bandwidth requirement will be just 640 bps per node. Similar savings can be realized for memory usage on the node as well. For example, the ROM footprint of the neighborhood interface implementation in heterogeneous sensor network (HSN) routing (from the TinyOS 1.1.10 snapshot on SourceForge) is about 7.5 KB.

• Application-aware routing: Application requirements often change dynamically with a change in the user query (the fusion application task graph), or with a change in network conditions. In response to such changes, the application may want to change its routing behavior. For example, if a fusion function is communication-intensive, then the node that runs it may already be overloaded, and should not be considered for data forwarding by the routing module. GPSR cannot adapt to such changes in the application requirements without information sharing across the layers. A need for application-aware routing has been observed before in many other scenarios [46, 25].
• Radio duty cycle control: Controlling the duty cycle of the radio is important for conserving node energy in sensor networks. One technique is to avoid wasteful listening to increase the network lifetime [86, 11, 9]. For effective duty-cycle control, the responsible module (ASCENT in the example stack) must be aware of the node’s
dynamic communication requirements. Unfortunately, this knowledge is dispersed across several other layers. For example, even though the routing layer may be idle, the application layer may be waiting on incoming traffic to carry out a fusion operation. Thus the controller cannot control the duty cycle simply by considering the routing module's requirement.

This motivating example serves as a springboard for designing an information exchange service across the layers of the protocol stack.
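As a sanity check on the transmission-cost row of Table 2, the figures follow from a short calculation. This is an illustrative Python sketch; the one-update-per-second period is our assumption, since the text only speaks of "every period of data collection":

```python
# Reproduce the Table 2 transmission-cost figures (sketch; the 1-second
# update period is an assumption -- the thesis only says "per period").
NEIGHBORS = 10          # average neighborhood size
STATE_BYTES = 8         # per-neighbor state information
MODULES = 3             # GPSR, ASCENT, and DFuse each collect independently

bits_per_update = NEIGHBORS * STATE_BYTES * 8   # 640 bits per period
without_sharing = MODULES * bits_per_update     # each module collects its own copy
with_sharing = bits_per_update                  # collected once, shared via IES

print(without_sharing, with_sharing)  # 1920 640
```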
4.3 Organization and Information Taxonomy
Table 3: Cross-layer Information Produced by Different Protocol Layers

  Protocol Layer      Sample Implementations         Produced information       Consumed information
  Application         DFuse [36], Surge,             Resource requirement,      Resource availability,
                      TAG [47]                       Sensed data,               Neighborhood,
                                                     Transmission schedule      Topology
  Routing             Directed diffusion [30],       Routing metric values,     Neighborhood,
                      GPSR [34], SPEED [24],         Topology information       Application requirement
                      TAG tree routing
  Medium access       SMAC [86], Z-MAC,              Duty cycle,                Application requirement,
  control, Duty       T-MAC [72], ASCENT [9],        Neighborhood information   Link information
  cycle control       SPAN [11]
  Link layer          SP (sensor network             Link status                Physical condition
                      protocol) [61]

It is clear from the previous section that decisions in the different layers of the protocol architecture can benefit from cross-layer information sharing. To this end, we first identify the different cross-layer information. Table 3 presents a snapshot of such information commensurate with the functionality provided by a particular layer. For example, the link layer (such as SP [61]) uses the physical condition of the environment as input to produce "link status" information as output that may be useful to other modules. This table is not meant to be exhaustive, but simply serves as a boiler plate for the taxonomy to be presented in this section.

One way to facilitate efficient decision making in each layer is to query the other layers for relevant information. Figure 21 depicts the potential information exchange for the sample architecture in Figure 20.

Figure 21: Cross-layer information exchange among the modules (Application/Data Fusion, Role Assignment, Flood Routing, HSN Routing, MAC, and Time Sync Service). Edge labels: 1. Fusion requirement; 2. Data transmission requirement; 3. Neighborhood; 4. Data transmission requirement; 5. Data transmission requirement; 6. Neighborhood, Topology; 7. Time synchronization accuracy requirement; 8. Data transmission requirement; 9. Role schedule, Duty cycle information.

Clearly, direct querying of peer modules will result in breaking the modularity of the protocol architecture and lead to an unstructured and hard-to-maintain code base. The fundamental challenge is in developing a layered software architecture that preserves the modularity while allowing cross-layer information sharing. This raises several important research issues:

1. Organization: How do we organize the layered software architecture? One promising approach is to decouple the data needed for such information exchange from the functionality of the layered architecture.

2. Taxonomy: How do we develop a useful taxonomy for the kinds of information that will be needed by the different layers?

3. Information Sharing: How do we facilitate information sharing across the layers that is efficient and non-intrusive on the functionality provided by each layer?
4.3.1 Organization and Information Sharing
We presented a layered software architecture, called SensorStack (see Figure 22), in Chapter 2. At the heart of SensorStack is the Information Exchange Service (IES), which serves as an information broker among the different modules of the layered architecture and the application to facilitate cross-layer optimizations. Our approach is to decouple the data needed for such information exchange from the functionality of the stack. To this end,
we first identify the different cross-layer data and develop a taxonomy for grouping them. Table 3 presents a snapshot of such data commensurate with the functionality provided by a particular layer. For example, the link layer (such as SP) uses the physical condition of the environment as input to produce "link status" information as output that may be useful to other modules. This table is not meant to be exhaustive, but simply serves as a boiler plate for the taxonomy being presented in this section.

Figure 22: SensorStack: A proposed FWSN stack (reproduced from Chapter 2). The stack comprises the Application, Data Fusion, Data Service, Helper Service, Medium Access, and Radio layers; the Information Exchange Service sits across them, providing an attribute-value publish/subscribe interface, with the Helper Service layer providing localization and synchronization services.

4.3.2 Taxonomy
For the purpose of extensibility and documentation, we represent the attributes in the taxonomy in XML format. Clearly, it would be too inefficient to access information across layers by parsing the XML representation of each attribute. Rather, every attribute in the taxonomy is given a unique identifier known to all the layers, and the identifier is used to refer to an attribute, thus avoiding the need for XML parsing. We discuss the assignment of unique identifiers in Section 4.4.

The information produced and consumed by each layer to facilitate cross-layering can be grouped into four broad categories: local resources, neighborhood, application requirements, and wildcard. Examples in XML format for each of these categories are shown in Table 4.

1. Local resources: The application layer working in concert with the system monitoring module may produce information about the available node resources. Important resources to identify include details regarding available energy, CPU, memory, radio, and sensors. Table 4 shows an example in this category for a battery resource.
Table 4: Example XML descriptions for a taxonomy of cross-layer information, with one example entry per category: Local resources (a battery resource), Neighborhood (a neighbor's state), Application Requirements (a DFuse application's data transmission requirements), and Wildcard (the role played by a node in a DFuse application).
2. Application requirements: An application may produce information that would be of use in the decision making at the routing and MAC layers. As an example, Table 4 shows an application's data transmission requirements.

3. Neighborhood: For scalability and load balancing reasons, FWSN protocols take many decisions locally, and information about neighboring nodes plays a very important role. Link layer protocols can produce link qualities of the neighboring nodes. The routing layer can collect routing-metric-based information, e.g., energy, location, and availability. Table 4 shows an example entry for a node in the neighborhood. Using this information, a link layer protocol can use the timeOn and timeOff fields to minimize idle listening. The listen attribute can be used to inform the link layer to expect transmission from a neighbor, and it can be used for bi-directional low power communication [61].

4. Wildcard: There may be other information produced by a particular protocol layer that may not fall into the categories we have identified so far. Examples include abstract region specification for node cooperation [78], area abstraction in SPEED for multicast groups [24], path abstraction for energy-aware routing [88], and role abstraction for load balancing [36, 18]. We group them as wildcard in our taxonomy. Table 4 shows an example wildcard entry (the specific entry shown describes the role played by a node in a DFuse application).
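Since every taxonomy attribute maps to a unique identifier known to all the layers (so that XML is never parsed on the data path), the mapping can be sketched as a registry keyed by category and attribute name. This is an illustrative Python sketch, not the thesis implementation; the identifiers stand in for the ones an attribute name server would hand out, and the helper names are ours:

```python
# Hypothetical sketch of an attribute-id registry over the four taxonomy
# categories; every XML attribute description resolves to one unique id.
CATEGORIES = ("local_resources", "neighborhood",
              "application_requirements", "wildcard")

_registry = {}      # (category, name) -> attribute id
_next_id = [1]

def register_attribute(category, name):
    """Assign (or look up) the unique id for an attribute description."""
    if category not in CATEGORIES:
        raise ValueError("unknown taxonomy category: %s" % category)
    key = (category, name)
    if key not in _registry:
        _registry[key] = _next_id[0]
        _next_id[0] += 1
    return _registry[key]

# All layers resolve the same description to the same identifier.
battery = register_attribute("local_resources", "battery")
role = register_attribute("wildcard", "dfuse_role")
```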
4.4 Information Exchange Service Design
IES is an information repository for data that helps in cross-layer optimization. Such data may come from one of the modules of the SensorStack or even from the application itself. The taxonomy presented in Section 4.3 allows grouping the data into different categories irrespective of where it came from and enables easy access by a requesting module. Also, by gathering all the information into this repository, SensorStack can exercise control over access/update rights in a centralized manner.

4.4.1 Design Goals
First, we enumerate the design goals of IES.

1. Efficient use of limited memory: Memory is a scarce resource in embedded devices; therefore, it has to be used prudently. It is quite easy to populate the repository with any and all information that may be useful for cross-layer optimization. However, only a fraction of this information may actually get used by the layers for adaptability. Therefore, IES must filter out unnecessary information and allow the protocol modules to share the memory efficiently.

2. Simple interface for information sharing: To ensure that the SensorStack remains modular, cross-module information sharing for adaptability should not lead to coupling of functionalities across layers. Toward this requirement, modules should be able to share data without concerns of synchronization and consistency. Further, information access should be transparent to producers and consumers (i.e., producers do not know who the consumers are and vice versa). Therefore, IES should provide a simple interface that allows the modules to be implemented independently and efficiently.

3. Extensibility: IES should facilitate new information that is outside its repertoire of taxonomy to be added without any change in either the interface or the underlying architecture. For example, if a new routing protocol is added to the stack, it should be able to publish into IES any new metric that may not currently be in the taxonomy.

4. Asynchronous access to information: Producers and consumers of data should not be burdened with unnecessary work. This goal translates to IES providing an asynchronous interface for information from publishers to be pulled into the repository, or information to be pushed from the repository to consumers, obviating the need for polling on the part of the producers and consumers.

5. Complex event notification: To make SensorStack adaptable, protocol modules should be notified of changes reactively. This goal translates to protocol modules being able to register events of interest (which may be a composite of several attributes) with the IES, and receive asynchronous notification when the condition becomes true.

The main objective of IES is to ensure that the SensorStack remains modular (goals 1-3) while supporting adaptability (goals 4 and 5). Access control, security, and protection are also important for IES, but they are outside the scope of this thesis.

4.4.2 IES Architecture
As shown in Figure 23, IES comprises two main components: the Data Management Module (DMM) and the Event Management Module (EMM). DMM is responsible for helping achieve modularity, while EMM is responsible for helping achieve adaptability. DMM is designed as a shared memory abstraction augmented with a fully-associative cache for efficient access; it offers a publish-subscribe interface for sharing information across layers. EMM is designed as a rule-based event notification engine such that protocol modules can be notified as requested, allowing them to adapt to changes in the environment.
Figure 23: IES architecture. The top half of the diagram shows the Data Management Module (DMM): data publishers put attributes into the shared memory (fronted by a cache) and data subscribers get them, with publisher and subscriber lists and the DRE/DAE signals mediating the exchange. The bottom half shows the Event Management Module (EMM): event subscribers add rules to the rule list, the rule execution engine accesses DMM via periodicGet, and subscribers are notified with RSE. Note that EMM acts as a subscriber to the DMM component.

Figure 24 shows the nesC [21] representation of the IES interfaces. DMM controls access to the data repository, and thus provides the publisher and subscriber interfaces. DMM maintains the publisher and subscriber lists to support asynchronous exchange of information, especially to support the periodic get method. EMM is responsible for rule registration, execution, and notification to the subscribers. It provides a watchdog interface. Based upon the periodicity requirements of registered rules, EMM accesses the data published in the DMM component using the periodicGet call. Below we elaborate on the design elements of IES that match the five goals identified in Section 4.4.1.

4.4.2.1 Efficient use of limited memory
There are two aspects to efficiency in this context: first, prudent use of limited memory; second, fast access to the stored information. IES uses a block of pre-allocated memory as the information repository. The size of the pre-allocation depends on availability; however, in general the amount of information that needs to be stored far exceeds this size.

    interface Publisher {
      // register itself with IES
      command result_t register(uint16_t *attributeList, uint8_t attributeCount,
                                uint8_t publisherID);
      // deregister itself as a publisher of an attribute
      command result_t deregister(uint16_t attribute, uint8_t publisherID);
      // put data
      command result_t put(uint16_t attribute, uint8_t *value, uint8_t size,
                           uint32_t expiry, uint8_t stickiness);
      // serve a request to publish data
      event result_t dataRequestEvent(uint16_t attribute);
    }

    interface Subscriber {
      // subscribe for a periodic get
      command result_t periodicGet(uint16_t attribute, uint32_t periodicity,
                                   uint8_t subscriberID);
      // unsubscribe from a periodic get
      command result_t unsubscribe(uint16_t attribute, uint8_t subscriberID);
      // get data
      command result_t get(uint16_t attribute);
      // handle the notification of data availability
      event result_t dataAvailableEvent(uint16_t attribute, uint8_t *value,
                                        uint8_t size);
    }

    interface Watchdog {
      // register a rule; periodicity = 0 means the rule is to be
      // checked as frequently as possible
      command result_t notificationRequest(uint8_t ruleID, uint32_t periodicity);
      // handle the notification from IES
      event result_t ruleSatisfactionEvent(uint8_t ruleID);
    }

Figure 24: IES API summary.

IES uses an LRU eviction policy when information has to be retired from it. There is a possibility that data may be retired from the memory before anyone requests it. For this reason, IES allows the producer to tag the data with a sticky bit to override the LRU policy. Alternatively, IES also has the ability to asynchronously "pull" the data from a producer upon a request from a consumer. For fast access to common data, IES uses a small fully-associative cache to keep the frequently requested data. The motivation behind using the cache is that if some information is requested by one module, it will likely be requested soon by other modules as well. This is especially true in SensorStack because different modules cooperate to achieve some common goal, e.g., energy optimization, and hence may be querying some common attribute from IES (such as the application's data requirement or the remaining battery level).
4.4.2.2 Simple interface for information sharing
IES provides a publish/subscribe interface to the shared memory for transparent sharing of information. Publishers can put information in a standard data format, and subscribers can get the same without knowing the publishers. Since information is stored as attribute-value pairs, multiple publishers can publish the same information with different attributes.

Protocol adaptation depends on the information provided by IES. Therefore, it is essential to ensure the freshness of the data provided by IES. Producers need to know how frequently they need to update the information published by them; consumers need to know if the information they are getting from IES is fresh. Asynchronous access (to be described shortly) deals with the former, while the latter is dealt with by the producers tagging information with an "expiration date".

4.4.2.3 Extensibility
Extensibility is achieved by using standard interfaces and data formats. IES is accessed using get/put over an attribute id. get copies the value (if available) and returns the number of bytes corresponding to the data value; a return value of zero indicates that the data is currently unavailable. put writes the value into IES, and returns the success/failure of the write operation as a boolean value.

    int  get( int attribute_id, byte[] value );
    bool put( int attribute_id, byte[] value, int size );
Every attribute id maps to a unique attribute description, an XML-based declarative description of the attribute. The attribute description corresponds to a unique entry in a standard ontology of information pertinent to the WSN. Given a declaration, the attribute id can be obtained by contacting an attribute name server. The idea of an attribute name server is similar to a DNS lookup for an IP address. However, the design of the name server is outside the scope of this thesis.

4.4.2.4 Asynchronous access
With asynchronous access to IES through the publish/subscribe interface, there are four possibilities for information sharing between publishers (P) and subscribers (S) under the
arbitration of IES: push-push, push-pull, pull-push, and pull-pull.

The push-push choice yields the best result from the point of view of freshness of information, but it has two downsides: there is a potential for wasted effort if there are no subscribers to published data that is being frequently updated, and there is a potential for duplication of effort if multiple modules are publishing the same information. This may nevertheless be a preferred choice for sharing neighborhood information that is prone to change quite frequently.

The second choice is push-pull. In this case, P pushes all the information to IES, and S pulls the required information from IES. This choice has the disadvantages of the push-push design, i.e., publishing of unwanted and redundant information, combined with a lack of promptness in information sharing: S now has to keep polling IES for the available information. If there are many subscribers, the constant polling may overload IES and delay its promptness in serving the requests. The only upside to this choice is that IES has to do less work to deliver information to the subscribers. There are similar pros and cons for the last two choices: pull-push and pull-pull.

None of the above design choices serves best for the exchange of cross-layer information; rather, different attributes may be best shared in different ways. For example, battery information may need to be shared in a reactive manner, while neighborhood information may be shared in a proactive manner. This observation motivated us to explore how to support all of the above design choices with a simple interface. While proactive communication can be handled by the simple get and put methods, we added event-based signaling in IES to support reactive communication.
A subscriber can request reactive access to data either by setting up a periodicity in the get call, i.e., the subscriber gets data periodically, or by using the complex event notification service, where the subscriber gets notified whenever a specific condition is met. For supporting periodic updates, a get call expects a periodicity, and a put call expects an expiration time, as extra parameters. IES uses two events for signaling an update: a Data Request Event (DRE) to request a publisher to put data when the data is either expired or unavailable in IES, and a Data Available Event (DAE) to notify a subscriber of an available update. Figure 25 shows the use of asynchronous signaling to handle a failed get request because
Figure 25: Use of asynchronous signaling in IES when a requested attribute is not available in IES (1. get; 2. get returns 0; 3. DRE to a publisher; 4. put; 5. DAE to the subscriber).

the requested attribute is not available in IES memory. This may happen because either none of the publishers put the attribute or the attribute was evicted, possibly expired, from the IES memory. IES selects a publisher (if any) for the requested attribute, and it raises a DRE for that publisher. Once the publisher puts the attribute, IES notifies the waiting subscriber using a DAE with a data pointer. The subscriber then gets the data from IES. However, it may happen that before the subscriber handles the DAE, the attribute gets evicted from IES, making the DAE void. To prevent an attribute from being evicted before the DAE is handled, IES keeps a time window before which the attribute is not evicted. A subscriber is expected to handle the DAE within the time window, or else the subscriber must issue a fresh get call.

Figure 26 shows the use of asynchronous signaling to handle a periodic update request. IES periodically checks if the requested attribute has expired or is unavailable in the repository; it then signals the publishers with a DRE. IES maintains the periodicity by using multiple timers. Of course, because of the asynchronous nature, the periodicity cannot be guaranteed accurately; it may depend on how fast the publishers are able to handle DREs.
Figure 26: Use of asynchronous signaling in IES for handling periodic updates (sequence: 1. periodic get; 2. DRE; 3. put(energy, 6); DAE; 4. DRE; 5. put(energy, 4); 6. DAE).

4.4.2.5 Complex event notification
Often a protocol module may need to adapt its behavior when certain conditions are satisfied: changes in the environment, resource availability, and/or application requirements. Such adaptability to dynamic changes is quite common in wireless protocol stacks, and this goal is aimed at helping protocol modules monitor these changes in a fast and efficient manner. IES uses a predicate-based rule representation to capture complex conditions. A rule takes the form of 'if condition do notify module P'. Conditions are well-formed formulae over the IES attributes. For example, a simple rule can be 'if (energy < 5) do notify routing module'. IES keeps checking if the specified condition is satisfied, and when it is, it notifies the respective subscribers with a rule satisfaction event (RSE).

The two important design questions in this context are: how frequently should IES check for rule satisfaction, and how should IES handle the case when the condition attributes are not currently available in IES memory? There is a trade-off between the promptness of event notification and the incurred computation cost. Owing to the resource-constrained nature of sensor devices, IES checks for condition satisfaction only periodically. IES uses the frequency of access/updates to the attributes to fine-tune this periodicity. In case an attribute is unavailable at the time of checking, IES signals the publishers for the required attribute data.
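The 'if condition do notify' rule checking just described can be sketched as predicates evaluated against the repository. This is an illustrative Python rendering, not the thesis implementation (which checks rules periodically or on put):

```python
# Sketch of EMM-style rule checking: rules are predicates over individual
# attributes; a satisfied rule delivers the rule id to its subscriber,
# playing the role of the thesis's rule satisfaction event (RSE).
store = {"energy": 4, "neighborhoodSize": 7}

rules = []   # (rule_id, attribute, predicate, notify)

def add_rule(rule_id, attribute, predicate, notify):
    rules.append((rule_id, attribute, predicate, notify))

def check_rules():
    """One periodic pass over the registered rules."""
    for rule_id, attribute, predicate, notify in rules:
        if attribute in store and predicate(store[attribute]):
            notify(rule_id)   # rule satisfaction event (RSE)

fired = []
# 'if (energy < 5) do notify routing module'
add_rule(1, "energy", lambda v: v < 5, fired.append)
# 'notify me when (neighborhoodSize > 10)'
add_rule(2, "neighborhoodSize", lambda v: v > 10, fired.append)

check_rules()
print(fired)   # [1] -- only the energy rule is satisfied
```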
4.4.3 Discussion
4.4.3.1 Information exchange across nodes
The focus of IES has been to facilitate information sharing across layers on a node, and not across different nodes. It is obvious that information sharing across nodes will help an individual SensorStack adapt better to network changes than taking decisions based only on local information. Information exchange across nodes, however, can be supported without changing the IES design. The responsibility of collecting remote information can be assigned to an independent layer, say an information collection plane. The new collection protocol module accesses the information requirement from IES; it then collects the remote information and publishes it to IES. The stickiness flag in the put method can be used to express the collection cost of the data. If the collection cost is high, IES should give the data more weight when applying the memory eviction policy. We explore IES across nodes in Chapter 5.

4.4.3.2 Expressibility of EMM
EMM currently supports rules expressed using predicates over only individual attributes, and not over an arbitrary set of attributes. For example, a module cannot specify "notify me when a new node joins the neighborhood," for three reasons: first, the neighborhood is a set of nodes; second, the new node's attribute is still unknown; and third, the IES design is not optimized to support a rule concerning a large number of attributes. However, assuming that the event subscriber module knows that the current neighborhood size is n, it can express this notification request as "notify me when (neighborhoodSize > n)."

4.4.3.3 Data marshalling/unmarshalling
IES currently leaves the onus of data marshalling/unmarshalling to the user modules. However, IES can be extended to handle marshalling/unmarshalling using upcalls: the user module provides handlers for each attribute type that are invoked as needed by IES before executing
the interface commands. We have not explored this extension because of the lack of function-pointer support in TinyOS. In Linux, it can be easily supported.
4.5 Implementation
This section describes the IES implementation in TinyOS and Linux. We have implemented all three interfaces: Publisher, Subscriber, and Watchdog. TinyOS provides support for asynchronous communication among components, which is very useful in implementing the event notification service. However, the static nature of TinyOS makes memory management restrictive and event notification inefficient. Linux, on the other hand, does not provide direct support for asynchronous communication among kernel modules, and thus needs indirect mechanisms; e.g., our approach for kernel module notification uses a workqueue.

4.5.1 IES in TinyOS
TinyOS is a component-based operating system designed for concurrent operations on resource-constrained embedded devices. Components provide interfaces to be used by other components. An application is written as a set of components wired together using the interfaces and events. Though TinyOS itself provides only basic send and receive interface support over CSMA-based radio control, the other layers (such as routing and fusion) are implemented as independent modules. The modules are statically wired together through their component interfaces to realize the network protocol stack.

4.5.1.1 Data management module
Since TinyOS is designed for resource-constrained devices, e.g., the Mica2 with 4 KBytes of RAM, it uses static memory optimization techniques to generate memory-efficient code. Because TinyOS does not support dynamic memory allocation, we statically allocate a chunk of memory to be used by IES, and use priority-based eviction to control its usage. Every IES entry is of fixed length, which permits an easy and efficient implementation of DMM even without dynamic memory support. However, this restriction limits the flexibility of the get and put methods: the attribute value must be of fixed size, which is 4 Bytes in our case. Figure 27 depicts the DMM implementation. An IES entry is 9 Bytes long,
with 2 Bytes for the attribute, 4 Bytes for the value, 2 Bytes for the expiration time, and one Byte for maintaining the sticky bits. The sticky field value is used to influence the memory eviction policy.

Figure 27: DMM memory hierarchy. The direct-mapped cache maps an attribute to its array index in the data bank (a set-associative cache); each data bank entry holds an attribute (2 Bytes), a value (4 Bytes), a sticky field (1 Byte), and an expiration time (2 Bytes).

DMM is implemented as a two-level cache: first, a direct-mapped cache to keep frequently used attributes, and second, a set-associative cache that we call the data bank. The first-level direct-mapped cache maps an attributeID to a unique index in the second-level cache. The data bank stores a list of attribute-value pairs. So, if the attributeID is available in the direct-mapped cache, then the corresponding index value is used to get the attribute value from the data bank. If the attributeID is not present in the first-level cache, then the data bank needs to be searched. By keeping the data bank set-associative, the search space is reduced to the associativity factor. As an example, for a direct-mapped cache of 8 entries and a 16-way set-associative data bank of 256 entries in total (16 sets), each direct-mapped cache entry is 24 bits (a 16-bit attributeID and an 8-bit data bank array index). For a hit in the direct-mapped cache (which we call a cache hit), an attribute is obtained in 2 accesses (one to the direct-mapped cache, and another to the data bank). For a miss in the direct-mapped cache (a cache miss), an attribute is obtained in at most 17 accesses (one to the direct-mapped cache, and at most 16 to the data bank, as there are 16 entries per set). In case of a miss in the data bank, asynchronous signalling is used to notify a producer (see Section 4.4.2.4).
4.5.1.2 Event management module
The EMM implementation supports comparison-based conditional rules. A module interested in being notified registers itself with the Watchdog interface. EMM, in turn, can register itself as a DMM subscriber for the attribute in the specified rule, and can then periodically check the rule. Currently, periodic checking of rules is not implemented; rather, the checking is done whenever relevant attributes are updated through a put command. Another source of inefficiency comes from TinyOS limitations. A TinyOS application can be thought of as a set of modules whose dependency graph needs to be specified statically at compile time. Because of this static nature, event subscription also becomes static. Thus all event subscriptions need to be encoded at compile time itself. In our implementation, we facilitate dynamic rule addition through a simple trade-off: we allow an event notification to be triggered when any one of a set of rules is satisfied. Thus a rule can be dynamically added to a rule set, but the rule satisfaction is notified to all the modules registered for any rule in that set. A rule satisfaction event has a rule identifier field, which can be used by the subscribers to filter the notifications of interest to them.

4.5.2 IES in Linux
Linux is not event-based; consequently, to provide asynchronous event notification support, a kernel thread is devoted to running an event queue, reusing Linux's "workqueue" mechanism, which is primarily intended for deferred execution of code running in an interrupt context. The spawned kernel thread is dedicated to running items in an event queue and is made runnable only when events are pending. Events run in the order queued, and synchronization is required to queue and dequeue events. The actual latency from an event's triggering to its execution thus depends both on how busy this event queue is and on the overall workload of the system.
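The queuing discipline just described can be illustrated with a userspace analogue: a dedicated thread drains a synchronized FIFO and sleeps when nothing is pending. The in-kernel version reuses the workqueue mechanism; this pthread-based sketch (all names, such as `event_queue` and `event_worker`, are illustrative) only demonstrates the pattern of ordered, deferred execution.

```c
#include <pthread.h>
#include <stdlib.h>

typedef void (*event_fn)(void *arg);

struct event {
    event_fn fn;
    void *arg;
    struct event *next;
};

static struct event *head, *tail;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  nonempty = PTHREAD_COND_INITIALIZER;
static int stopping;

/* Example callback recording execution order, for demonstration. */
int order[8], order_n;
void record_order(void *arg) { order[order_n++] = (int)(long)arg; }

/* Queue an event; wakes the worker if it was idle. */
void event_queue(event_fn fn, void *arg)
{
    struct event *e = malloc(sizeof *e);
    e->fn = fn; e->arg = arg; e->next = NULL;
    pthread_mutex_lock(&lock);
    if (tail) tail->next = e; else head = e;
    tail = e;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

/* Dedicated worker: runnable only while events are pending. */
void *event_worker(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&lock);
    for (;;) {
        while (!head && !stopping)
            pthread_cond_wait(&nonempty, &lock);
        if (!head && stopping)
            break;
        struct event *e = head;         /* FIFO: run in queued order */
        head = e->next;
        if (!head) tail = NULL;
        pthread_mutex_unlock(&lock);
        e->fn(e->arg);                  /* run without holding the lock */
        free(e);
        pthread_mutex_lock(&lock);
    }
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Ask the worker to exit once the queue has drained. */
void event_stop(void)
{
    pthread_mutex_lock(&lock);
    stopping = 1;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}
```

As in the kernel implementation, latency from queuing to execution depends on how many events are already ahead in the FIFO and on when the scheduler runs the worker thread.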
4.6
Evaluation
This section evaluates the effectiveness of IES in supporting cross-layering in SensorStack. First, through an extensive set of micromeasurements, we investigate the overhead of data
access through the IES interface for different scenarios, and compare them with the case where data is accessed directly through protocol modules' interfaces. We also measure the overhead incurred in checking rules in EMM. Finally, we evaluate a complete protocol stack to quantify the benefits of using IES, specifically in terms of application longevity and communication savings in data collection from neighboring nodes. Below, we first present the micromeasurements, and then we explore the macromeasurements.

Figure 28: Comparing information access latency using HSN's neighbors interface and using the IES interface. In the IES case, the neighborhood data is accessed directly from the DMM cache.

4.6.1
Micromeasurements
Here we present IES overhead results on the Mica2 (for TinyOS) and the iPAQ 3870 (for Linux). For the TinyOS experiments, we measure the data access latency for different DMM configurations, and observe that the latency overhead increases linearly with the increase in data bank size and the number of subscribed attributes (see Section 4.6.1.1). Using a direct-mapped cache in front of the data bank reduces the latency significantly. For the Linux experiments, EMM measurements on an iPAQ show that rule checking takes a very small fraction of time compared to the time taken by the publishers handling the data requests (see Section
4.6.1.2).

Figure 29: Memory access overhead for DMM in TinyOS. Part (A) shows the results when the attribute is present in the data bank, and part (B) shows the results when the attribute is not in the IES at all.

4.6.1.1
TinyOS Results
We use the SysTime interface of TinyOS for timing measurements on the Mica2 platform. SysTime provides timer values at 1.2-microsecond granularity. The direct-mapped cache size is fixed at 32 entries, and the data bank size is fixed at 256 entries. The set associativity of the data bank is varied from 8 to 64. Each data point is an average over 100 readings. Figure 28 shows the memory access latency values when all the attributes are present in the direct-mapped cache. It also presents the latency values when the same information is accessed directly by invoking the neighbors interface of the HSN routing component. As expected for this ideal case, IES memory access is much faster than access through the HSN routing module. IES allows only 4-byte values for an attribute. As expected, the latency increases linearly with the number of attributes accessed. Figure 29(A) shows the case when there is a miss in the direct-mapped cache, and Figure 29(B) shows the case when there is a miss even in the data bank. As the associativity is increased, the access latency increases linearly because of the increase in the number of comparisons DMM has to do to get the attribute. In case of a miss in the data bank, the latency results include the cost of signalling the DRE, the publisher doing the put, and finally
signalling the DAE.

Figure 30: Memory access overhead comparison for DMM in TinyOS. For the IES case, a 32-way set-associative data bank is used.

Figure 30 compares the memory access latency of accessing 32 bytes of data for various cases. It confirms the benefit of using the direct-mapped cache. When the data is available in the data bank, the memory access latency for a 32-way set-associative data bank is comparable to that of directly accessing data from the HSN interface. For frequently accessed attributes, the data access latency using the direct-mapped cache is negligible compared to the latency using the HSN interface. If the data is neither in the direct-mapped cache nor in the data bank, the latency incurred is about three times that of using the HSN interface.

4.6.1.2
Linux Results
The following experiments are performed on an iPAQ running the Linux "familiar" distribution, version 0.6.1. To measure the latency incurred in accessing an attribute through the IES interface, we use the following workload. There are three configurable parameters: the data bank size, the direct-mapped cache size, and the number of producer modules. The workload is varied by changing the number of producer modules from 20 to 100. For all these experiments, there are 10 consumer modules. Every 10 seconds, a consumer executes a "get" for an attribute produced by a randomly chosen producer. The data bank and the direct-mapped cache are allocated as a contiguous chunk of memory.
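The contiguous-chunk layout mentioned in the last sentence can be sketched as follows. The entry layouts and names (`Dmm`, `dmm_alloc`) are illustrative assumptions, not the actual implementation; the point is that a single allocation holds the direct-mapped cache followed immediately by the data bank.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Illustrative entry layouts; sized so the bank stays 4-byte aligned. */
typedef struct { uint16_t attr; uint16_t bank_index; } CacheEntry;
typedef struct { uint16_t attr; uint16_t pad; uint32_t value; } BankEntry;

typedef struct {
    CacheEntry *cache;       /* start of the chunk                   */
    BankEntry  *bank;        /* immediately follows the cache entries */
    size_t cache_entries, bank_entries;
    void *chunk;
} Dmm;

/* Allocate the cache and the data bank as one contiguous chunk. */
int dmm_alloc(Dmm *d, size_t cache_entries, size_t bank_entries)
{
    size_t bytes = cache_entries * sizeof(CacheEntry)
                 + bank_entries  * sizeof(BankEntry);
    d->chunk = calloc(1, bytes);     /* single allocation, zeroed */
    if (!d->chunk)
        return -1;
    d->cache = (CacheEntry *)d->chunk;
    d->bank  = (BankEntry *)(d->cache + cache_entries);
    d->cache_entries = cache_entries;
    d->bank_entries  = bank_entries;
    return 0;
}

void dmm_free(Dmm *d) { free(d->chunk); d->chunk = NULL; }
```

One allocation avoids per-structure heap overhead and keeps both levels adjacent in memory, which matches the experimental setup (16-entry cache, 256-entry bank).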
Figure 31: DMM performance on Linux. For (A), the direct-mapped cache size is fixed at 16 entries and the data bank size is varied; for (B), the data bank size is fixed at 256 entries and the direct-mapped cache size is varied. In the legends, store size represents the size of the data bank, and CPU time represents the latency incurred in accessing an attribute from IES.

Figure 31 shows the average latency incurred in accessing an attribute under different workload settings. In Figure 31(A), the direct-mapped cache size is fixed at 16 entries, the data bank size is varied, and the experiments are run for 120 seconds. These results show that the latency increases linearly with the size of the data bank due to the increased set associativity. In Figure 31(B), the data bank size is fixed at 256 entries and the direct-mapped cache size is varied. The results show that the latency decreases exponentially with the size of the cache. These results are quite intuitive and expected.

For the EMM evaluation on Linux, we have a list of 20 rules, and each rule is subscribed to by an "event consumer" module. A rule consists of a condition on two attributes. We maintain a hash table that maps an attributeID to an index in the rule list. When the value of one of these attributes changes in DMM, the corresponding rules are executed to check if they hold. If a rule evaluates to true, we call it a rule match, and an event is queued in the workqueue so that the dedicated workqueue thread can notify the consumer modules waiting on that rule. For example, if five rules match, five events are queued to notify the respective consumers. In summary, for a change in an attribute value, there are three steps involved in notifying an event consumer: first, check if the rules corresponding to
the changed attribute match; second, queue an event for every matching rule; and third, dequeue the events and notify the consumer modules waiting on the matched rule.

Figure 32: EMM performance on Linux. The time spent in checking the rules is minimal compared to the time taken by a handler routine in serving the notification event. In the legends, "number of matching rules" represents the number of rules to be evaluated, and time represents the delay incurred for the different steps.

Figure 32 shows the time measurements for the above three steps involved in EMM. We see that the time spent in matching the rules is very small (on the order of 200 ns per rule) and scales roughly linearly with the number of matching rules. The second step, which consists of allocating and queuing rule satisfaction events, takes approximately 2 microseconds per triggered event. The third step incurs a latency of approximately 37 microseconds per event.

4.6.2
Macro Evaluation
To understand and quantify the benefits of using IES in the context of a complete protocol stack, we have assembled the example protocol stack shown in Figure 20; we call it OldStack for the purpose of the following description. We have implemented DFuse in TinyOS to run as the fusion application on top of the OldStack. The GPSR protocol is used for routing, and its code has been obtained from UCLA. Finally, we interfaced OldStack with the IES implementation to obtain the SensorStack. Thus, the only difference between OldStack and SensorStack is the optimization done using IES. Below, we explain SensorStack to show the benefits of using IES between the fusion layer and the data service layer.

4.6.2.1
IES between Fusion and Service Layers
Our evaluation stack consists of DFuse at the fusion layer and GPSR at the service layer. DFuse is an architecture for programming fusion applications. Given a fusion task graph, DFuse maps the task graph as an overlay over the physical network. In DFuse, a network node can play only one of three roles: end point (source or sink), relay, or fusion point [4]. An end point corresponds to a data source or a sink. The network nodes that correspond to end points and fusion points may not always be directly reachable from one another. In this case, data-forwarding relay nodes may be used to route messages among them. For the purpose of explaining the benefits of using IES with DFuse, we briefly revisit the algorithm DFuse uses for task graph mapping.

4.6.2.2
DFuse role assignment algorithm
DFuse uses a distributed role assignment algorithm for placing fusion points in the network. Role assignment is a mapping from a fusion point in an application task graph to a network node. The inputs to the algorithm are an application task graph (assuming the source nodes are known), a cost function, and attributes specific to the cost function. The output is an overlay network that optimizes the role to be performed by each node of the network. The role assignment algorithm is based upon a simple heuristic: first perform a naive assignment of roles to the network nodes (initialization phase), and then allow every node to decide locally if it wants to transfer the role to any of its neighbors (optimization and maintenance phase). During the optimization and maintenance phase, every node hosting a fusion point role is responsible for either continuing to play that role or transferring the role to one of its neighbors. The decision for role transfer is taken by a node based solely upon local
information. A node (hosting a fusion point) periodically informs its neighbors about its role and its health, an indicator of how well the node can host its current role. Upon receiving such a message, a neighboring node computes its own health for hosting that role. If the receiving node determines that it can play the role better than the sender, then it informs the sender (fusion node) of its own health and its intent to host that role. If the original sender receives one or more intention requests from its neighbors, the role is transferred to the neighbor with the best health. Thus, with every role transfer, the overall health of the overlay network, with respect to the cost function, improves.

4.6.2.3
Application longevity improvement
We have performed simulation-based experiments using TOSSIM [37] to see how much communication improvement can be achieved by using cross-layer information in the SensorStack. Towards that, we have implemented a sample cost function, Minimize transmission cost (MT). This cost function aims to decrease the amount of data transmission required for running a fusion function. Input data needs to be transmitted from the sources to the fusion point, and the output data needs to be propagated to the consumer nodes (possibly across hops). For a fusion function f with m input data sources (fan-in) and n output data consumers (fan-out), the transmission cost for placing f on node k is formulated as:

c_MT(k, f) = \sum_{i=1}^{m} t(source_i) * hopCount(input_i, k) + \sum_{j=1}^{n} t(f) * hopCount(k, output_j)
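Numerically, the cost function is a pair of weighted sums, as the following sketch shows. Here t(x) is the transmission rate of data source x and hopCount is the hop distance, as defined in the surrounding text; the function name `mt_cost` and the array-based interface are assumptions for illustration, since in practice the hop counts would come from the routing layer.

```c
#include <assert.h>

/* c_MT(k, f): transmission cost of placing fusion function f on node k.
 * rate_in[i] is t(source_i); hops_in[i] is hopCount(input_i, k);
 * rate_out is t(f); hops_out[j] is hopCount(k, output_j). */
int mt_cost(int m, const int rate_in[], const int hops_in[],
            int n, int rate_out, const int hops_out[])
{
    int cost = 0;
    for (int i = 0; i < m; i++)          /* cost of moving inputs to k   */
        cost += rate_in[i] * hops_in[i];
    for (int j = 0; j < n; j++)          /* cost of moving output from k */
        cost += rate_out * hops_out[j];
    return cost;
}
```

For example, a fusion point with two inputs at rates 3 and 4 (2 and 1 hops away) and one consumer 3 hops away, producing output at rate 5, costs 3*2 + 4*1 + 5*3 = 25 transmission units.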
Here, t(x) represents the transmission rate of the data source x, and hopCount(i, k) is the distance (in number of hops) between nodes i and k. Figure 34 shows that the network cost improves with time because of role migration using the MT cost function described above. Network cost here means the total number of bytes (corresponding to the application data) transmitted by all the participating nodes. This experiment uses a binary-tree-based task graph with 1 sink, 3 fusion nodes,
and 4 data sources, for a 121-node physical network placed in a grid topology with an inter-node spacing of 20 feet. Figure 33 shows the mapping of the task graph over the physical network. The simple radio model from TOSSIM is used, which does not account for packet loss with distance. The transmission range is set to 35 feet; thus every node has a neighborhood size of 8 nodes. A drop in the network cost corresponds to a role migration that brings a fusion point closer (in number of hops) to its producers and/or consumers. A drop in the network cost also entails an improvement in energy saving, and thus an increase in application longevity.

Figure 33: Task graph overlay over a grid topology: an 11x11 network with 20-foot spacing.

4.6.2.4
Control overhead improvement
To support role migration, the fusion layer periodically probes its neighboring nodes to collect their health information. This health information is based on the cost function in use. Neighborhood information is also collected by the routing layer (GPSR) to make its routing decisions. Figure 35 shows the periodic control overhead incurred by the fusion layer and the routing layer. These numbers correspond to a network of 121 nodes placed in a grid. To change the neighborhood size, the inter-node spacing in the grid is varied between 8 and 15 feet, while keeping the transmission range fixed at 50 feet. For the radio model, we have used LossyBuilder, a Java tool provided by TOSSIM for generating loss rates from physical
Figure 34: Network cost improvement with time because of role migration using cost function MT. topologies. The tool models loss rates observed empirically in an experiment performed by Woo et al. on a TinyOS network [19]. LossyBuilder assumes each mote has a transmission radius of 50 feet. The empirical model shows that even when nodes are kept at 15 feet, their link reliability is quite low. Hence, we found that for an inter-node spacing of 15 feet, the average neighborhood size is only 4; at 10 feet, the neighborhood size is 12; and at 8 feet, the neighborhood size is about 24 nodes. Results show that as the number of fusion points in the input task graph is increased, control overhead of the fusion layer increases, implying that the potential for saving (by using IES to use the routing layer information) increases. Similarly, an increase in neighborhood size increases the potential savings. For a task graph with 8 fusion points, and for a network with average neighborhood density of 12 nodes, using IES saves about 112 packets/period, where the period depends on the periodicity of cost function evaluation in DFuse.
4.7
Related Work
Related work on cross-layering can be broadly divided into two groups: the first considers all the layers together in a holistic way, and the second considers pairs of protocol layers.
Figure 35: Potential saving in control overhead that can be achieved by information sharing via the IES. In the legends corresponding to the fusion layer overhead, 4, 12, and 24 are the neighborhood sizes, i.e., the number of nodes that are queried for their health information. Period is a configurable parameter that represents the periodicity of neighborhood information exchange among nodes.

SensorStack falls in the first group; however, it uses the findings from specific cross-layering instances between layers. The MobileMan project [15] has a goal similar to SensorStack's: to support cross-layering in a centralized way by facilitating information sharing. But there are two main differences between the IES architecture and MobileMan's architecture [14]. First, instead of providing centralized shared memory, MobileMan provides a callback-based approach such that consumers can directly access a producer's private data. This approach implies that the consumer has to know the publisher, the consumer has to bind early to the producer, and asynchronous access to data becomes difficult. Second, conditions for asynchronous access are set as black-box functions instead of predicates over shared variables. Using their approach, even when there may not be any change in the shared data, every condition has to be checked periodically, leading to inefficiency. Researchers at Berkeley have proposed a sensor network architecture that takes a microkernel approach. They advocate bringing the standard interfaces exposed to applications down from the transport layer of the Internet stack to the link layer [16]. The proposed link layer
abstraction, SP [61], also aims to share neighborhood information and a message pool with all other protocols. While the sharing of information is motivated similarly to IES, SP is confined to link layer information only, and it does not provide a generic publish/subscribe interface like IES. Also, SP does not allow rule-based event notification as done in IES. Message passing and information exchange are inherent in any distributed system, especially systems providing network services, e.g., directory services, peer-to-peer systems, network monitoring, and sensor networks. Recent research projects [83, 85, 73, 84] have looked at the scalability and self-configuration aspects of utilizing such distributed information for controlling application and network behavior. While there is rich insight into accessing remote information even in a very large network, past research has mostly ignored how to use this information to facilitate developing adaptive applications and adaptive protocol stacks. Network Weather Service [83] provides mechanisms to gather network information by employing soft sensors and statistical prediction algorithms. Astrolabe [73] provides a scalable system for disseminating sensor information in a large distributed system. Astrolabe uses aggregation techniques to provide a compressed view of a network zone while still providing consistency guarantees. The declarative interface of IES is motivated by research work on publish/subscribe systems [48, 22]. The distributed systems research community has looked extensively at event-based systems, and such research has helped shape the IES design choices.
4.8
Summary
This chapter presents a proof of concept that modularity and adaptability can be achieved simultaneously in a network protocol stack. We observe that cross-layering is important to achieve adaptability, but doing so arbitrarily limits modularity. To solve this problem, we decouple cross-layer data from the functionality provided by the layers. Based on this idea, we present the design of a novel Information Exchange Service (IES) to facilitate cross-layering. The publish/subscribe-based data management module helps achieve modularity by standardizing the cross-layer interaction, and the rule-based event management
module helps achieve adaptability by supporting reactive notification of changes. We present a simple taxonomy for cross-layer information sharing that provides transparency without affecting modularity. We share our experience in implementing IES on TinyOS and Linux. TinyOS provides support for asynchronous communication among components, which comes in handy for supporting the event notification service. However, the static nature of TinyOS makes memory management restrictive and event notification inefficient. Linux, on the other hand, does not provide direct support for asynchronous communication among kernel modules, thus needing indirect mechanisms. We have presented results showing that cross-layer information gathering adds little overhead to the basic functionality of the stack. The IES presented in this chapter is limited to modules on a single node. IES on a single node helps node-level adaptability. But to bridge the communication gap between node-level and network-level adaptability, there is a need to extend IES across nodes. In the next chapter, we explore the problem of designing a data dissemination service to support IES across nodes.
CHAPTER 5
INFORMATION DISSEMINATION SERVICE: IES ACROSS NODES
This chapter looks at the problem of minimizing the communication overhead incurred in supporting the IES across nodes. While IES on a single node allows a protocol module to take advantage of data being collected by other modules, IES across nodes allows a node to take advantage of data being collected by other nodes. Using the intuition that the usefulness of IES data is localized in nature, we designed an information dissemination service (IDS) as part of the SensorStack. We observed that using probabilistic broadcast to share IES data, and using query scope to limit the broadcast, helped support IDS at low communication overhead.
5.1
Introduction
As described in Section 2.3, the data service layer in a sensor network provides two main services, namely, dissemination of (potentially fused) data to one or more neighbors, and reception of data packets for fusion or delivery to the application. We introduce into the data service layer of the SensorStack a novel functionality that is the focus of this chapter. By their very nature, applications running on sensor networks tend to be network centric, so the nodes need to cooperate to meet application requirements. The modules (implementing the different layers of the SensorStack) at a given node may therefore need to adapt to changes in node conditions at remote nodes (published in their respective IESs). Thus the SensorStack modules at a given node may need to subscribe to remote IESs, and/or aggregate and share useful information from multiple nodes. For example, a fusion application (such as DFuse) may need to monitor the health of its neighbor nodes periodically for role assignment decisions. An energy-conscious routing module may need the
same information simply for making its routing decisions. Thus queries for such information may emanate not only from the application level but also from the different layers of the SensorStack, directed to remote nodes. However, there may be considerable overlap among different queries. Consider two nodes issuing the same query, "which of my 2-hop neighbors have 80% battery level?". If the two nodes are topologically close to each other, then considerable overlap in the results is likely as well. The results may be exactly the same if the two nodes share the same neighborhood. However, if the queries are processed independently in the network, significant overhead due to duplication results. We propose an information dissemination service (IDS) to provide a unified framework for information sharing among the nodes.
5.2
Problem Definition
The main objective here is to support continuous queries over remote IES data with minimal communication overhead. Since the problem space of data dissemination and continuous queries over stream databases is vast, it is important to reemphasize that we confine our work to control data dissemination over a large-sized FWSN environment. Control data differs from application data in the following ways:

• Want vs. need: Application data needs to be disseminated to its subscribers, while IES control data is merely desired at remote consumer nodes to help improve adaptability. Thus, the tolerance in delay or accuracy for control data is higher than that for application data. This opens up opportunities for optimizations in IDS that may not be possible when handling application data.

• Localized broadcast vs. multicast/unicast: Application data is mostly communicated in many-to-one or one-to-one fashion, with a scope of the entire network, while control data is desired to be disseminated to all nodes within a region of interest.

• Small vs. large size: Application data size may vary from a few bytes to many KBytes, but most control data packets will have a size of a few bytes. This makes it desirable to piggyback control packets, or to transmit them lazily in a digest format.
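The piggybacking idea in the last point can be sketched as follows: several small (attributeID, value) pairs are packed into one digest payload before transmission. The structure and names are illustrative assumptions, and the 29-byte payload limit is only an example figure, not a value taken from the thesis.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_MAX 29   /* example radio payload limit, in bytes */

typedef struct {
    uint8_t len;                 /* bytes used in buf */
    uint8_t buf[PAYLOAD_MAX];
} Digest;

/* Append one (attributeID, value) pair: 2 + 4 bytes.
 * Returns 0 on success, or -1 if the digest is full and should be sent. */
int digest_add(Digest *d, uint16_t attr, uint32_t value)
{
    if (d->len + 6 > PAYLOAD_MAX)
        return -1;               /* transmit the current digest first */
    memcpy(d->buf + d->len, &attr, 2);
    memcpy(d->buf + d->len + 2, &value, 4);
    d->len += 6;
    return 0;
}
```

Packing four 6-byte pairs into a single 29-byte payload amortizes one packet header over four control updates, which is the saving the bullet point alludes to.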
There are two main sources of redundant communication in supporting IES across nodes. First, there may be considerable overlap in the scope of queries being executed at different nodes. Second, a consumer may fire its query more frequently than the rate of change in the remote data; for example, even when there is no change in its temperature reading, a node may redundantly send its reading to a remote node. This chapter explores techniques to minimize such redundancy in communication, and provides IDS support in a scalable and adaptable manner.

5.2.1
Research Issues
To facilitate information sharing across nodes, IDS should provide a high-level query interface that is not only expressive, but also makes it easy to find possible overlap among different queries. A typical query may appear as follows: "select attributes {x, y, z} from all nodes within the scope S satisfying the conditions {p, q, r}, and deliver the results with a periodicity of T." Supporting such a query model leads to the following research issues:

1. Detecting overlap among queries: Queries overlap if and only if the scope of the participating nodes as well as the collected results overlap. Detecting such an overlap helps control flooding of a query, as well as sharing of the results. Overlap detection is a challenge since it depends on many factors, and needs to be done in a distributed and scalable manner. Though queries may not match one another exactly, often there may exist a partial overlap (in their scope, in the attributes of the query, and/or in the results). Defining a notion of partial overlap, and detecting such an overlap in a robust manner, is a challenging issue.

2. Sharing results of overlapping queries: Once a query overlap is detected, the results can be shared to save on the network communication for the results. Sharing results among queries where one query subsumes another, both in scope and in attributes, is straightforward. But if there is only a partial overlap among queries, then executing the queries in the non-overlapping scopes, and avoiding duplication of results, are challenging issues to be addressed.
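One ingredient of the overlap detection discussed above, the *spatial* part, can be illustrated by treating each distance-scoped query as a disk around its initiator. This sketch uses squared distances to avoid floating-point square roots; the types and names (`QueryScope`, `scopes_overlap`) are assumptions for illustration, and attribute and predicate overlap would still have to be checked separately.

```c
#include <assert.h>

typedef struct { double x, y, radius; } QueryScope;

/* 1 if the two query scopes (disks) intersect, 0 otherwise. */
int scopes_overlap(QueryScope a, QueryScope b)
{
    double dx = a.x - b.x, dy = a.y - b.y;
    double r  = a.radius + b.radius;
    return dx * dx + dy * dy <= r * r;   /* dist <= r_a + r_b */
}

/* 1 if scope a fully subsumes scope b (b's disk lies inside a's). */
int scope_subsumes(QueryScope a, QueryScope b)
{
    double dx = a.x - b.x, dy = a.y - b.y;
    double r  = a.radius - b.radius;
    if (r < 0)
        return 0;                        /* b is wider than a */
    return dx * dx + dy * dy <= r * r;   /* dist + r_b <= r_a */
}
```

When `scope_subsumes` holds, the results of the inner query can simply be filtered from the outer one; when only `scopes_overlap` holds, the partial-overlap issues above apply.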
5.3
Related Work
In-network aggregation of data has been an active area of interest in the WSN community for quite a while [47, 56]. However, earlier research has focused on a homogeneous network with non-overlapping aggregation trees. The IDS design is targeted at FWSN, where many aggregation trees can co-exist simultaneously with potential spatial overlap. The database literature is rich with techniques to find overlap in the attribute space of queries. However, spatial overlap among queries is still a relatively unexplored area. Streaming database research [66] has explored cluster-based topologies to support multiple task graphs; but since they use a task-based query model, they do not detect and utilize any spatial overlap among different queries. Location-based queries have been proposed recently by the database community, e.g., "finding restaurants within a 5-mile radius," but they have been mostly restricted to centralized approaches [91].
5.4
IDS Design
This section looks at the IDS design with respect to supporting regionally scoped queries. Globally scoped queries are discussed in Section 5.6.

5.4.1
Region Creation
When a protocol module initiates a region-based query, a region is first created by informing all the neighboring nodes that may lie within the scope. To limit flooding of beacon packets, a region is scoped either in terms of number of hops or in physical distance from the initiator node. A beacon packet consists of a region identifier, a scope, and a set of predicates. A node receiving a beacon checks if it lies within the scope of the region; if it does not, it ignores the packet. Otherwise, the node becomes a member of the region and broadcasts the beacon packet. Thus the beacon packet propagates as a wave originating from the initiator node, and the wave dies beyond the query scope. If a node satisfies the beacon's predicates, it also acts as a producer node, i.e., it will need to send the query response to the consumer node. For simplicity, we assume here that the response is sent back periodically; we will look later at how the response can be
optimized to decrease the communication overhead.

5.4.2
Data dissemination
An obvious option for disseminating data from the producer nodes to their consumers is to create data diffusion trees, with the consumer nodes at the tree roots. Since there will be as many trees as queries, supporting data sharing across different trees becomes important as the number of trees grows. However, efficient sharing across different diffusion trees is a problem in itself, and typical solutions (explored in Section 5.6) have to incur the overhead of tree maintenance, and also of disseminating tree attributes continuously to allow sharing of a tree by other nodes. Because of such overheads, using diffusion trees for control data dissemination becomes inefficient as the number of trees grows. Another option is to simply flood the produced data to the whole region. As the number of consumer nodes increases within a region, the overhead of flooding per consumer node decreases, especially when the consumer nodes are subscribed to the same control data. Since control data packets are small, and they can be piggybacked with each other and sent as a single packet, even consumers with differing attribute requests help amortize the flooding cost. IDS uses both of the above techniques depending on the network conditions. To optimize the communication overhead, flooding is scoped within a query region only. Also, to avoid the broadcast storm problem [71], we control the rebroadcast probability; the resulting flooding protocol is called probabilistic broadcast (pbcast). The next section quantifies how different conditions affect the choice between using pbcast and using a tree for communication.
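The per-node forwarding decision of pbcast can be sketched as follows: a node rebroadcasts a received beacon with probability p, and only while the hop budget (the query scope) is not exhausted, so the wave dies at the scope boundary. The names and the hop-count representation of scope are assumptions for illustration; the `rand`-based coin flip stands in for whatever RNG a real node would use.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int region_id;
    int hops_left;   /* remaining scope in hops, decremented per rebroadcast */
} Beacon;

/* Returns 1 if this node should rebroadcast the beacon. */
int pbcast_should_forward(const Beacon *b, double p)
{
    if (b->hops_left <= 0)
        return 0;                                /* wave dies beyond the scope */
    /* Uniform draw in [0, 1): forwards with probability p. */
    return ((double)rand() / ((double)RAND_MAX + 1.0)) < p;
}
```

Setting p = 1 reduces this to plain scoped flooding; lowering p trades delivery reliability for fewer redundant rebroadcasts, which is exactly the trade-off the evaluation in the next section explores.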
5.5
Evaluation
The main purpose of the following experiments is to find out when to use probabilistic broadcast and when to use diffusion trees to share IDS data. We have implemented both ways of sharing IDS data in TinyOS. We do not implement sharing across multiple diffusion trees, nor do we use reverse multicast techniques to build an efficient diffusion tree. To understand the protocol behavior, we simulated IDS using TOSSIM [37] for different network connectivity graphs.
Table 5: Simulation Parameters for the IDS experiments

Parameter                  | Description                                                                                                  | Values                 | Effect on experiments
Physical spacing (e)       | Inter-node spacing of the 11x11 grid topology                                                                | 10, 15 feet            | Affects the size of a node's neighborhood
Region size (s)            | Scope of a query, represented as a physical distance (radius with respect to the query initiator)            | 1.5, 2, 2.8, 3 times e | Affects the number of nodes within the query scope
No. of regions (n)         | Number of nodes initiating an IES query over remote data; all region heads are located around the grid center | 1 to 8                 | Affects the amount of potential sharing among the diffusion trees
Forwarding probability (p) | Probability of rebroadcast; relevant only for pbcast-based IDS                                               | 0.1 to 1               | Affects the pbcast overhead and reliability
Query periodicity          | Frequency of query execution                                                                                 | Variable               | Affects the latency of update propagation and the overhead incurred
Transmission range         | Radio transmission range                                                                                     | Fixed, 50 feet         | Affects a node's neighborhood size
The following experiments are based on the empirical radio model supported in TOSSIM, where every link is treated as a lossy channel with a loss probability based on empirical data. Instead of a perfect radio model, we use the empirical radio model because it allows us to see the effect of packet loss during broadcast. TOSSIM provides a Java tool, LossyBuilder, for generating loss rates from physical topologies. The tool models loss rates observed empirically in an experiment performed by Woo et al. on a TinyOS network [19]. LossyBuilder assumes each mote has a transmission radius of 50 feet. Thus, each mote transmits its signal to all nodes within a 50-foot radius, and the quality of the received signal decreases with increasing distance from the transmitter. Given a physical topology, the tool generates packet loss rates for each pair of motes based on the inter-mote distance. We use a grid-based topology to obtain the loss pattern. Keeping the radio transmission power fixed (50'), the inter-mote distance is varied, thus affecting the loss probability. The region heads, the nodes that initiate an IDS query, are located around the center of the grid.

5.5.1 IDS Overhead
IDS overhead measurement is important to quantify the cost of providing the IES data across nodes. The overhead depends on many factors, including the number of regions where remote queries are being executed, the scope of the regions, a node's neighborhood size, and the periodicity of the queries. For pbcast, it also depends on the forwarding probability. The effect of some of these factors on the IDS overhead is predictable, while for many others it is not. For example, intuitively, the overhead should increase with an increase in the number of regions, in the forwarding probability of pbcast, or in the scope of the queries. But it is difficult to predict whether a tree-based IDS will have more overhead than a pbcast-based IDS under different scenarios. To understand which factors among the above are more important than others, we measured IDS overhead under varying conditions.

Figure 36: IDS overhead for executing multiple queries. Figure A shows the overhead for a grid topology with 10' spacing, and figure B shows that for a grid with 15' spacing. pbcast incurs lower overhead than explicit collection using multiple diffusion trees for different queries.

Figure 36 compares the tree and pbcast approaches. Both diagrams (A and B) show that IDS overhead increases with an increase in the number of query regions. But because it uses a low forwarding probability, pbcast incurs lower overhead than the tree approach. Also, as the number of queries grows, the difference in IDS overhead between pbcast and tree increases. This is because, with more regions overlapping each other, pbcast saves more on communication overhead than tree-based data collection does. If we allowed the trees to share data and benefit from the overlap, using diffusion trees might be a viable approach for supporting IDS. But the overhead of finding such overlaps and
maintaining the multicast trees may not be justifiable for dynamic and short-lived queries. Since pbcast does not have any overhead for building and maintaining trees, it will perform better than the tree approach for short-lived queries in dynamic environments.

Figure 37: Reliability in terms of the number of responses received by the query initiators (region heads). Figure A shows the result for a grid topology with 10' spacing, and figure B shows that for a grid with 15' spacing.

5.5.2 Quality of Decision
A decision may involve many factors; thus the quality of a decision (QoD) depends on the accuracy of information about those factors. In SensorStack, the quality of a decision by a protocol module depends on the accuracy of a query result. It is very hard to measure the accuracy of a query result that involves multiple attributes that are heterogeneous in nature and whose values are time sensitive and require collection from distributed nodes. For simplicity, we define QoD as the fraction of attributes received. For example, if a query involves 10 attributes, and the query is executed using the received values of only 5 of them (with default or last-known values for the remaining 5), we say the QoD is 50%. Similarly, for a query over remote IES data, QoD can be defined as the fraction of nodes (out of the total nodes in the query scope) whose IES data is received. Here, QoD assumes knowledge of the number of nodes in a query scope. For example, if there are 10 nodes in a query scope, and the region head receives responses from 5 of them, then its QoD is again 50%. QoD defined in terms of the number of received responses has many
limitations. For a query that wants to find out the number of nodes in its scope, QoD as defined above does not make sense. But for many other relevant queries, including finding averages or min/max/median, the result can be expressed probabilistically in terms of the number of responses. QoD depends on the network reliability and the data communication protocol used. Figure 37 shows how the number of received responses varies for different scenarios. Figure A compares the two approaches (tree and pbcast) for a network with an inter-node physical spacing of 10', and figure B does the same for a spacing of 15'. For a transmission range of 50', a 10' spacing entails a larger neighborhood size than a 15' spacing. Figures 37 A and B show that for the 10' spacing case, the tree and pbcast results are comparable; but for 15' spacing, pbcast provides higher QoD than the tree-based approach.

5.5.3 Comparison summary
In summary, pbcast provides comparable or better QoD at lower transmission overhead than tree-based IDS support. For a dense network and a small number of overlapping queries, a tree-based approach has similar overhead without compromising QoD. Though pbcast can be tuned to have lower overhead by decreasing its forwarding probability, such a reduction comes at the cost of QoD. For a sparser network and a large number of overlapping queries, however, pbcast provides much better QoD at lower communication overhead than a tree-based approach.
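The QoD metric used in these experiments reduces to a simple ratio. A minimal helper (the name and signature are ours, not part of SensorStack) makes the definition from Section 5.5.2 concrete:

```python
def qod(num_received, num_expected):
    """Quality of Decision: the fraction of expected attribute values
    (or node responses within a query scope) that were actually received."""
    if num_expected <= 0:
        raise ValueError("query scope must contain at least one node")
    return num_received / num_expected
```

For instance, a region head expecting 10 responses that receives 5 of them has a QoD of 0.5, matching the 50% example in the text.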
5.6 Supporting global queries over IES data
We discuss preliminary ideas for supporting global queries in IES. A general design for global query support is quite challenging. As a first cut, for simplicity we assume a support service for location awareness, and queries are assumed to be geographically scoped. The two main research issues (Section 5.2.1) related to the IDS design can be dealt with in the following way: 1. Identifying a nearness metric: In a location-aware FWSN, nearness can be easily captured using physical distance. However, since sensor network links have been found
to be asymmetric in nature, physical distance may not be able to capture reliability and hop-count between nodes. Still, location based querying provides an easy way to express scope, where finding spatial overlap among queries becomes straightforward. We will use location-based scope definition as our first-cut solution to the problem, and extend it further by considering the reliability and asymmetry issues. It should be noted that relative coordinates suffice for location identification (i.e. there is no need for extra hardware such as GPS at a node). 2. Detecting overlap among queries and sharing results: Using location-based scoping for queries, the spatial overlap can be easily detected by finding the common areas from the scope. To share the results among overlapping scopes, we divide the topology into smaller clusters with a cluster head that knows the membership of the nodes within a cluster. There are several design choices to be explored: • Query-aware hierarchical clustering: Clustering allows information sharing at the cluster level, thus avoiding the overhead of flooding inside clusters (if the cluster head already has the query results). We will explore dynamically constructing hierarchical clusters to further optimize queries, perform result aggregation, and result sharing. A key design attribute will be keeping the overhead of cluster membership maintenance to a minimum. • Result Filters: Adding filters that are cognizant of “partial overlap” at cluster heads and/or at strategic nodes in the network is another design choice we plan to explore to aid in the aggregation and result sharing. • Dynamic role migration: Cluster heads are the nodes that act as the root of a local aggregation tree. These nodes are responsible for deciding if a query needs to be flooded inside a cluster. Recent research [89] indicates that cluster head and surrounding nodes incur much more communication than other nodes, and are prone to rapid energy depletion. 
We will explore precise definitions of the role played by a cluster head that allow role migration to other nodes within the cluster.
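Under the location-based scoping of item 2, detecting spatial overlap between two query scopes is a standard axis-aligned rectangle intersection test. A small illustrative sketch (names are hypothetical):

```python
def scopes_overlap(a, b):
    """True if two location-scoped queries overlap spatially.
    Each scope is an axis-aligned rectangle (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Rectangles overlap iff they overlap on both the x and y axes.
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1
```

Nodes in the common area of two overlapping scopes are candidates for result sharing via their cluster heads.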
Formation of Clusters. Cluster formation and cluster head selection can be done using simple location-aware clustering algorithms from prior research, e.g., LEACH [26]. This algorithm gives a two-level cluster hierarchy, with data sources at the first level and cluster heads at the second level.

Query Optimization. A location-based scope of a query is represented as a tuple < x, y >, such that any node in a rectangular region of length x and width y, with respect to the node's current location, becomes the area of interest. Cluster heads, having a cluster-level view of all existing query trees and membership information, participate actively in query optimization. For example, consider an aggregation tree rooted at a sink node, spanning the entire network. Now, if another sink wishes to build a query aggregation tree, and if the new tree physically overlaps with the earlier one, then within the overlap region only the cluster heads need to report the results back to the new sink. The non-overlapping regions are handled in the obvious manner.

Cluster Head Role Migration. For load balancing of the cluster head role, we use the role assignment algorithm from our earlier DFuse research [36]. Though DFuse was designed for fusion-point role migration, it can easily be extended to cluster-head role migration as well. During the maintenance phase of the DFuse algorithm, every node hosting a particular role is responsible for either continuing to play that role or transferring it to one of its neighbors. The decision for role transfer is taken solely by the current role player based upon local information. A role player, i.e., a cluster head in IDS, periodically informs its neighbors about its role and its health, an indicator of how good the node is at hosting that role. Upon receiving such a message, a neighboring node computes its own health for hosting that role.
If the receiving node determines that it can play the role better than the sender, it informs the sender of its intent to host that role. If the sender receives one or more such intention requests from its neighbors, the role is transferred to the neighbor with the best health.

On-Demand Clustering. Our first-cut approach partitions the network into a static grid, where the inter-node distance corresponds to one-hop transmission range. However, because of the inherent dynamic nature of FWSN, static clustering may not be suitable. Also,
quite often only a part of the deployed network may be involved in maintenance activities warranted by IDS (or even application level queries). Clearly, creating and maintaining clusters over the entire network may lead to unnecessary control overhead. On-demand clustering will reduce such control overhead.
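The role-transfer step of the DFuse-style migration described in this section can be sketched from the current role player's point of view, once neighbors have reported their health. This is an illustrative sketch, not the actual DFuse code; the function and its inputs are hypothetical:

```python
def should_transfer_role(my_health, intents):
    """Decide whether a cluster head should hand off its role.
    `intents` maps a neighbor id to the health it reported when declaring
    its intent to host the role. Returns the chosen neighbor id, or None
    if the current node should keep playing the role."""
    if not intents:
        return None                      # no neighbor volunteered
    best = max(intents, key=intents.get)  # neighbor with the best health
    if intents[best] > my_health:
        return best                      # transfer the role to this neighbor
    return None                          # keep the role locally
```

Because the decision uses only locally exchanged health beacons, no global coordination is needed for the migration.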
CHAPTER 6
INFORMATION DISSEMINATION SERVICE: BULK-DATA BROADCAST
This chapter extends the probabilistic broadcast support in IDS further, to a larger scope and to bulk data. Efficient and reliable dissemination of information over a large area is a critical capability of a sensor network, needed for tasks such as software updates and transferring large data objects (e.g., surveillance images). Thus the efficiency of wireless broadcast is an important aspect of sensor network deployment. Here, we study FBcast, an extension of pbcast based on the principles of modern erasure codes. We show that our approach provides high reliability, often considered critical for disseminating code. In addition, FBcast offers data confidentiality to a limited extent. For a large network, where every node may not be reachable from the source, we extend FBcast with the idea of repeaters to improve reliable coverage. Simulation results on TOSSIM show that FBcast offers higher reliability with fewer retransmissions than traditional broadcasts.
6.1 Introduction
We consider the problem of information dissemination in FWSN. This is an important domain of research because of the multitude of potential applications such as surveillance, tracking, and monitoring. FWSN nodes are resource constrained, and thus they are initially programmed with minimal software code and are updated whenever needed. For such on-demand programming, broadcast is typically used to disseminate the new software, making broadcast efficiency a very important aspect of FWSN deployment. An efficient wireless broadcast scheme must solve two key interrelated challenges: (i) Messaging Overhead: Traditionally, each node of a FWSN rebroadcasts any new data packet, resulting in many unnecessary transmissions [71]. For example, if a software update of k packets is to be sent over a FWSN of n nodes, potentially k times n broadcasts
Table 6: Hardware Platform Evolution

Mote             | WeC   | dot   | mica2 | iMote
Released         | 1999  | 2001  | 2003  | 2003
Processor (MHz)  | 4     | 4     | 7     | 12
Flash (code, kB) | 8     | 16    | 128   | 512
RAM (kB)         | 0.5   | 1     | 4     | 64
µcontroller      | Atmel | Atmel | Atmel | ARM
could be sent out. The larger the number of broadcasts, the more cumulative power is consumed by communication. Furthermore, the increased messaging overhead introduces more collisions and thus affects channel reliability. (ii) Reliability: The reliability of message dissemination is a key requirement for a sensor network to function properly. For example, in the case of a global software update, if the software at all nodes is not updated reliably, the collected data may become erroneous, or the network may run into an inconsistent state. To avoid such problems, reliable code dissemination becomes important. But empirical results establish that wireless channels are often lossy [19], and in the presence of channel loss and collisions, achieving high reliability becomes difficult. So far, two baseline approaches have been proposed in the literature, viz., deterministic and probabilistic flooding [59]. It turns out that simple deterministic flooding protocols are quite inefficient at addressing the issues mentioned above. In a probabilistic approach, each node randomly decides whether or not to broadcast a newly seen data packet. These baseline approaches do not assume any extra information about the network. Several variants and optimizations over these two baseline schemes have also been introduced [71, 69, 79, 39]. Typically, these derivatives either assume some structural information about the network, e.g., knowledge of the network neighborhood, inter-node distances, or views of possible local clusters, or they rely upon additional rounds of messages, such as periodic "Hello" packets and ACK/NACK packets following every broadcast. However, it may not always be possible to depend on such additional information, for reasons specific to sensor networks. For example, the nodes may not be equipped with GPS, or may be deployed in an area with very weak GPS signals. The information on neighborhood,
distance, location, etc., may continue to change due to mobility and failures. Periodic "gossip" becomes expensive to support because of the transmission overhead incurred and the dynamic nature of FWSN. Instead of a protocol that relies completely on controlling the communication, our intuition is to aid the messaging with computational pre/post-processing. The emerging hardware trend suggests that future sensors will have significant computing power. Table 6 presents a quick summary of the currently available sensor platforms [38]; devices such as the iMote have up to 64 KB of main memory and can operate at a speed of 12 MHz. Extrapolating into the future, motes will soon possess as much computing resource as today's iPAQs. However, while processor efficiency (speed, power, miniaturization) continues to improve, networking performance over the wireless medium is not expected to grow equally, simply because of the physical ambient noise that must be dealt with. Thus trading processor cycles for communication can offer many-in-one benefits in terms of smaller messaging overhead, less collision, enhanced reliability, and reduced power consumption. We propose a new baseline protocol based on a fundamentally different principle, that of forward error correcting codes (FEC). (Erasure codes are a class of encoding in which a data packet, seen as a collection of small blocks, is blown up with additional redundant blocks, such as parity checksums, so that if some blocks are lost due to any kind of noise, external signal, faults, or compromises, the original packet may still be reconstructed from the rest.) Like the other baseline protocols, the new protocol does not assume any structural knowledge about the sensor network, and needs no extra round of messaging such as ACK packets. We suggest the use of FEC to improve the reliability of information dissemination in sensor networks. Using detailed simulation experiments, we show that, in the presence of channel loss, the protocol achieves higher reliability than basic probabilistic broadcast, at a much lower messaging cost. FEC not only provides more resilience to temporal and spatial channel losses and packet interference, but also exposes to the application programmer multiple control knobs, such as the forwarding probability, the stretch factor, and the choice of encoding technique, through which an application may meet specific reliability needs. The contributions and findings of this chapter can be summarized as follows: (i) We present a new design principle for wireless broadcast in sensor networks. The
idea is to combine erasure coding with a probabilistic broadcast technique. Founded on this FEC principle, the new FWSN broadcast protocol, FBcast, offers high reliability at low messaging overhead. The new scheme also provides additional confidentiality. FEC has earlier been used for asynchronous data delivery and IP multicast in wired networks, but to the best of our knowledge, ours is the first work to explore the viability of applying FEC in wireless sensor networks, which have unique requirements and packet loss characteristics substantially different from wired networks. Ours is a vanilla data dissemination protocol that assumes no extra information about the underlying network. As we observe through our experiments, the transmission characteristics (such as signal strength and packet loss) vary fairly randomly as one moves radially outward from a data source; thus common assumptions made by many other protocols, such as a regular signal strength distribution over concentric circles or spheres, do not hold in reality. Using an FEC-based vanilla protocol in such a scenario becomes quite useful. (ii) We compare FBcast with probabilistic broadcast through simulation studies using the TOSSIM simulator [37]. Protocol efficiency can be evaluated along more than one axis, each of which can potentially be traded for another; e.g., reliability can be enhanced at the cost of extra messaging overhead, or of spatial density of the sensors, and so on. Thus a point-by-point fair comparison between these approaches may not always be possible. However, our experiments do suggest that FBcast performs better over a larger parameter space formed by the metric axes. (iii) We propose FBcast with repeaters to disseminate code over a large-area network. Over a multi-hop network, especially in sparse deployments, traditional broadcast reliability decreases with distance from the data source. We extend the basic FBcast protocol with the idea of repeater nodes.
Where to place the repeaters without much network information is a challenge. We present a novel heuristic to solve the repeater placement problem. We compare the performance of the extended FBcast protocol against a similar variant of the probabilistic protocol, and find the new protocol more effective in increasing the broadcast coverage. The intuition behind the better performance is the use of forward error correcting codes: instead of every node shouting in a neighborhood to ensure
reliable dissemination, it is better for one node to shout with built-in redundancy, and for the others to listen and reconstruct in a fault-tolerant manner. This approach reduces interference, as we shall demonstrate in the evaluation section. The chapter is organized as follows. Section 6.2 reviews the motivation and related broadcast protocols to place our work in context. Section 6.3 provides details of the FBcast protocol, including the encoding scheme used. Section 6.4 presents the implementation details of FBcast and preliminary results.
6.2 Background and Related Work
Baseline Approaches: The data dissemination techniques proposed so far for wireless networks can be divided into two major approaches: deterministic and probabilistic broadcast. Simple flooding [52] is the most naive implementation of the first class. However, naive re-broadcasting can easily lead to the broadcast storm problem [71], hence the need for controlled, density-aware flooding and multicast algorithms for wireless networks [17]. Simple flooding (depending on the placement of neighboring nodes) also suffers from severe inefficiency: the effective additional coverage of a new broadcast can be as low as 19% of the area of the original circle that a broadcast can effectively reach. Variants and Optimizations: Several other optimizations can be applied to these baseline schemes. For example, location awareness can be exploited to overcome issues such as the hidden neighbor problem. However, such a scheme requires more sophistication, such as GPS positioning. For a mobile environment, a location-aware protocol can be quite useful [71]. Similarly, knowing one's neighbors within the first few hops can help in broadcasting efficiently. In this case, neighbors exchange periodic "Hello" messages amongst themselves so as to form an idea of their neighborhood. Upon a broadcast, the message is tagged with the identifier of the broadcasting node so that neighbors within a threshold number of hops need not rebroadcast it [40, 59]. The strategy of selectively suppressing broadcast activity is known as pruning. The objective is to find a set of nodes (that will broadcast data) such that no other node needs to rebroadcast. These nodes constitute a flooding tree. Finding a minimal flooding tree is NP-complete [41]. The Dominant
Pruning (DP) algorithm is an approximation for finding the minimal flooding tree [41]. Lou and Wu proposed further improvements over DP that utilize two-hop neighbor information more effectively [43]. Again, the neighborhood information is maintained by periodic gossiping, which adds transport overhead. Garuda [58] provides reliable downstream data broadcast using a minimum dominating set of core nodes, which provides a loss recovery infrastructure for the remaining nodes. The overhead incurred by core selection and maintenance in Garuda may make it an expensive solution for dynamic networks. Similarly, since collision is a critical obstacle, one intuition is to have many of the nodes stay out of transmissions, creating a virtually sparser topology with fewer collisions. The Irrigator protocol and its variants are based on this idea [53]. For a comparative and comprehensive survey of these protocols, the reader can refer to related work [79]. A randomized variant of controlled flooding is given by the epidemic and gossiping algorithms for data management in large-scale distributed systems [73]. PSFQ [76] uses hop-to-hop error recovery by injecting data packets slowly, while neighbor nodes use NACK-based fast repair. On a similar note, Trickle [33] combines epidemic algorithms and density-aware broadcast for code dissemination in FWSN. Deluge [28] extends Trickle to the case where the message is a large data object occupying several bytes. Deluge uses CRC (cyclic redundancy check) codes to fight channel noise. Another variant of gossiping over pbcast is the Fireworks protocol [53]. FBcast and Erasure Codes: FBcast, as a base approach, provides another alternative to simple deterministic and probabilistic broadcasts. Other smart adaptations, such as location-aware retransmission or maintaining neighborhood and routing information, are expected to boost its performance.
FBcast is inspired by the well-known use of erasure codes in the domain of digital storage and transmission. It is a well-understood concept that, compared to naively replicating data, an erasure code can provide similar reliability at a much smaller space overhead [77]. Several distributed and peer-to-peer storage systems have been built on this principle. In wireless broadcasting, the two existing baseline approaches (simple broadcast and pbcast) are in a sense similar to naive replication, i.e., during a broadcast a data packet is fully replicated. An erasure code, however, breaks a
packet into smaller blocks and introduces redundancy in a way that is more space-efficient. Thus, with FBcast, reliability comes built into the data packets themselves, and less support is needed from the messaging subsystem. Rate-less erasure codes, such as Fountain codes, form a critical ingredient of our approach. They were first proposed as a tool for efficient multicasting [7]. Later, versions of such codes were shown to be useful for peer-to-peer file sharing and downloading [49]. FBcast applies the idea of fountain encoding to the previously known scheme of probabilistic gossip, to achieve high reliability without extra messaging overhead, in the domain of wireless networking, where bandwidth, power, and reliability are critical issues.

Figure 38: FBcast at source: m original packets are encoded into n packets and injected into the network.

Figure 39: FBcast at recipients: any k packets are chosen from the incoming new-packet queue and decoded back to the m original packets.
6.3 FBcast Protocol
Figures 38 and 39 pictorially represent the FBcast protocol. The data to be disseminated consists of m packets. The source blows up these m packets into n packets, using the encoding scheme described below, and the encoded data is injected into the network. Each recipient rebroadcasts a packet, if it is new, with a probability p. When a recipient node has received enough data packets for decoding (k ≥ m, where the exact value of k depends on the specific encoding scheme used), it reconstructs the data and passes it to the application. In order to encode and decode (using FEC), the nodes need to know a random seed from which the rest of the code parameters can be generated using a pseudo-random number generator. We assume that this seed and the generator (typically a very light-weight algorithm) are shared by all nodes.

6.3.0.1 Encoding schemes
Erasure codes provide the type of encoding needed by the protocol. In particular, it is desirable that the codes have the following properties. (i) The ratio n/m (also known as the stretch factor) can be made arbitrarily large or small, flexibly; in other words, one can generate as many redundant packets as needed, by a decision that can be taken online. (ii) There is no restriction on the packet length. (iii) Both encoding and decoding should be inexpensive. Standard erasure codes, such as Reed-Solomon codes, are inflexible in the first two aspects and quite inefficient in performance. Although these codes allow any desired stretch factor, this can only be set statically; it is not easy to change n and k on the fly during the application runtime, for the following reasons. First, these codes are defined over discrete mathematical structures called finite fields, and every time the stretch factor is to be readjusted, a new code must be defined, possibly requiring that a different finite field be constructed, and this information must be disseminated to all the participating nodes. Second, the code length parameter n is always upper bounded by the order q of the underlying field; every time a higher stretch factor is applied, a great deal of meta-information must be disseminated and computational overhead incurred. Third, the size of a symbol, i.e., the number of bytes treated as one unit of information, is also upper bounded by the field size; for a field of size q, the largest unit of information treated at one time can be at most log q bits. In our setting, the size of one packet is essentially the symbol length of the code being used, and thus the chunk of data that can be handled at one time is limited by this a priori fixed parameter. For a comprehensive discussion of the kinds of problems posed by standard codes, the reader can refer to the works of Luby, Mitzenmacher et al. [44, 45].
Fortunately, a modern class of erasure codes solves these problems effectively: Fountain codes. The category name fountain is suggestive: when one needs to fill a cup of water from a fountain, there is no need to worry about which particular droplets are collected; just enough drops to fill the glass are sufficient. Not all Fountain codes are equally flexible. The candidates we are particularly interested in are the Luby Transform codes (LT codes [44]), Raptor codes, and Online codes. We are interested in codes that are rateless, i.e., that can produce on the fly a fountain of encoded blocks from the m original message blocks. For a pre-decided small number ε, only k = (1 + ε)m data blocks out of this fountain suffice to reconstruct the original document. Moreover, there is no limitation on the symbol size, i.e., the packet length; any number of bytes can be treated as a single unit of information. An example of a rateless code is the Luby Transform code [44]. Our idea is to generate the blocks and sprinkle them over the neighbors, who re-sprinkle a small fraction of the fountain. The main benefit of data encoding is threefold. (i) Enhanced reliability, achieved by the extra information encoded in the data packets: if a node has noisy reception, it may not receive all the data packets, yet it can still regenerate the original data. (ii) Reduced transmission overhead: because of the redundancy, the recipients do not need to retransmit all the packets; each transmits only a few of what it receives, alleviating contention for the shared channel. (iii) Data confidentiality as an extra benefit: because of the shared nature of the wireless channel, confidentiality is often a requirement in wireless network deployments. To encode and subsequently decode the same data, the sender and receiver need to have a shared random seed.
Hence, no eavesdropper can decode the information from the wireless channel.
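The XOR-based encode/decode cycle of a rateless code can be illustrated with a short sketch. This is not the implementation used in FBcast: for brevity the encoder draws degrees uniformly rather than from the robust soliton distribution of LT codes [44], and all names are ours.

```python
import random

def lt_encode_block(message_blocks, seed):
    """Produce one encoded block: the XOR of a random subset of message
    blocks. A receiver sharing the seed can re-derive the subset."""
    rng = random.Random(seed)
    m = len(message_blocks)
    degree = rng.randint(1, m)             # a real LT code draws this from
    chosen = rng.sample(range(m), degree)  # the robust soliton distribution
    block = bytearray(len(message_blocks[0]))
    for i in chosen:
        for j, byte in enumerate(message_blocks[i]):
            block[j] ^= byte
    return bytes(block), chosen

def lt_decode(encoded, m):
    """Peeling decoder: resolve degree-1 blocks, substitute each recovered
    message block into the remaining encoded blocks, and repeat."""
    recovered = [None] * m
    pending = [(bytearray(blk), set(nbrs)) for blk, nbrs in encoded]
    progress = True
    while progress:
        progress = False
        for blk, nbrs in pending:
            for i in [i for i in nbrs if recovered[i] is not None and len(nbrs) > 1]:
                for j, byte in enumerate(recovered[i]):
                    blk[j] ^= byte         # peel off an already-known block
                nbrs.discard(i)
            if len(nbrs) == 1:
                i = next(iter(nbrs))
                if recovered[i] is None:
                    recovered[i] = bytes(blk)
                    progress = True
    return recovered
```

With enough encoded blocks, the peeling process cascades: each newly recovered block reduces the degree of others, which is exactly why roughly k = (1 + ε)m received blocks suffice.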
6.4 FBcast Evaluation
We have implemented the communication aspect of the FBcast protocol in TinyOS, i.e., we account only for packet transmission and reception. Though we do not implement the data
encoding and decoding in TinyOS, we use our experience with the fountain code implementation on Linux, discussed below, to tune the FBcast stretch factor. While we explore the effect of the encoding/decoding control parameters on FBcast reliability, we do not evaluate their effect on energy consumption or the computational latency they add to the broadcast, given the focus of this chapter. To understand the protocol behavior, we simulated FBcast using TOSSIM [37] for different network sizes. For a comparative study, we also implemented traditional probabilistic broadcast, pbcast, in TinyOS. We look at three aspects of FBcast and pbcast: reliability, transmission overhead, and latency. Reliability is measured as the percentage of motes that receive the original message being disseminated. If a mote receives only some of the injected packets, it may not be able to reconstruct the original data; we assume this to be true for both FBcast and pbcast. Transmission overhead is the sum total of packets transmitted by all the nodes during the simulation time. The simulation time is kept long enough that retransmissions by every mote are complete. Latency is the average time at which motes are able to reconstruct the original data after receiving enough packets; it does not include the data encoding or decoding time. For FBcast, latency is the expected time by which motes have received k packets, and for pbcast it is the expected time by which motes have received all the injected packets. The FBcast parameters are set as follows: m = 10, n ∈ {20, 40, 60}, k = 14, and p is adjusted in proportion to n. More precisely, p varies from 1/n to 8/n; thus, for n = 20, p is varied from 0.1 to 0.4. In other words, the number of packets in the original data is 10. With a stretch factor of 2, FBcast encodes the data to 20 packets and injects them at the source.
Our experiments reveal that a factor of 1.4 is sufficient, i.e., a mote that receives at least 14 distinct packets can reconstruct the original data. In the case of simple broadcast, only 10 packets are injected. For FBcast, the value of p is kept proportionally low as n is varied. For n = 20 and p = 0.4, a mote is expected to retransmit 8 out of the 20 new packets it receives; the retransmission overhead thus becomes equivalent to that of pbcast with p = 0.8. For the pbcast experiments, in the absence of any encoding or decoding, n = m.
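Under the simplifying assumption that each injected packet is lost independently with the same probability (the simulated channel is more complex), the benefit of the stretch factor can be estimated with a short sketch; the function names are ours:

```python
from math import comb

def fec_reliability(n, k, loss):
    # Probability that a mote collects at least k of the n encoded
    # packets when each packet is lost independently with prob `loss`.
    q = 1.0 - loss
    return sum(comb(n, r) * q**r * (1.0 - q)**(n - r) for r in range(k, n + 1))

def naive_reliability(m, copies, loss):
    # Re-injecting the m original packets `copies` times: a mote must
    # receive at least one copy of every packet to reconstruct the data.
    return (1.0 - loss**copies) ** m
```

At a 10% per-packet loss, fec_reliability(20, 14, 0.1) exceeds naive_reliability(10, 2, 0.1), consistent with the FEC advantage seen in Figure 40 for the same transmission budget.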
A few words about the implementation of the Fountain code. Our experience of implementing fountain codes suggests that by choosing m′ ≈ 1000 (number of message symbols) and n′ ≈ 6000 (number of encoded symbols), data can be reliably reconstructed from k′ ≈ 1400 symbols. A bunch of symbols can be coalesced to form a packet; e.g., by coalescing 100 symbols one obtains message blocks of size m = 10 packets and encoded blocks of size n = 60 packets. The memory requirement is also within the limits of present motes. For example, to implement LT codes (a class of fountain codes), one needs to keep in memory a bipartite graph of size n′ × log(m′/δ) (see Theorem 13 in [44]), where δ is a small constant (e.g., δ = 10⁻³ gives very high decoding accuracy). Thus, for the parameter set presented in this chapter and the most space-efficient representation of the data structures, the memory requirement would be a little over 60 KB, which is not far from current limits. Moreover, main memory is expected to soon reach megabytes, paving the way for more time-efficient representations of the structures for these algorithms. In our TOSSIM experiments, we simulated a network of mica2 motes. These motes presently have only 8 KB of memory, not enough for a full-scale implementation. However, devices such as iMotes already have 64 KB of memory, and it is fair to assume that enough space will soon be available on motes for running both the OS and applications of this order.

6.4.0.2 Results Summary
FBcast and pbcast can both achieve reliability close to 100%, but FBcast can do so over a larger window of variation in mote density, and at lower transmission overhead than pbcast. Also, while pbcast exposes only the forwarding probability to control its behavior, FBcast exposes both the forwarding probability and the stretch factor as controls. Thus FBcast can be adapted more flexibly to suit different deployment densities and reliability requirements. The repeater variants of pbcast and FBcast, designed for large network deployments, both have higher reliability than their original counterparts. However, the FBcast variant is easier to configure, and it attains more than 99% reliability for various deployment parameters at lower transmission overhead than the pbcast variant.
The rest of this section is organized as follows. First, after explaining the network model used, we start with simple experiments with no rebroadcast and observe the possible benefits of using FEC. Then we add probabilistic retransmissions to increase reliability and broadcast coverage. We also explore different ways in which FBcast can be configured. Finally, we add the idea of repeaters to FBcast to overcome a limitation observed for FBcast without repeaters, namely, broadcasting over a very large area.

6.4.1 Network model and assumptions
Unless specified otherwise, the following experiments are based on the empirical radio model supported in TOSSIM, where every link is treated as a lossy channel with a loss probability based on empirical data. We use the empirical radio model instead of a perfect radio model because it lets us see the effect of packet loss on broadcast. TOSSIM provides a Java tool, LossyBuilder, for generating loss rates from physical topologies. The tool models loss rates observed empirically in an experiment performed by Woo et al. on a TinyOS network [19]. LossyBuilder assumes each mote has a transmission radius of 50 feet: each mote transmits its signal to all nodes within a 50-foot radius, and the quality of the received signal decreases with distance from the transmitter. Given a physical topology, the tool generates packet loss rates for each mote pair based on the inter-mote distance. For experiments that use the empirical radio model, we use a grid-based topology to obtain the loss pattern. By varying the grid size, the inter-mote distance is varied, thus affecting the loss probability. The data source is assumed to be at the grid center because of the nature of the experiments. For experiments that use the simple radio model, the transmission loss probability is the same for all mote pairs. Nodes are assumed to be located such that each node can listen to all the other nodes. In TOSSIM, network signals are modelled such that distance does not affect their strength, making interference in TOSSIM generally worse than expected real-world behavior. However, due to the TinyOS CSMA protocol, the probability of two motes within a single cell transmitting at the same time is very low.
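The grid deployments used below can be generated programmatically. The sketch below uses a linear distance-to-loss ramp purely as a placeholder; LossyBuilder itself samples empirically measured loss rates [19], so the numbers here are illustrative and not the tool's output:

```python
import math

def grid_positions(side, spacing):
    # side x side motes, `spacing` feet apart; side=11, spacing=5 gives
    # the 121-mote, 5-foot deployments used throughout this section.
    return [(x * spacing, y * spacing) for x in range(side) for y in range(side)]

def pairwise_loss(positions, radius=50.0):
    # Placeholder loss model (our assumption): loss grows linearly with
    # distance and is total beyond the assumed 50-foot transmission radius.
    loss = {}
    for i, (xi, yi) in enumerate(positions):
        for j, (xj, yj) in enumerate(positions):
            if i != j:
                d = math.hypot(xi - xj, yi - yj)
                loss[(i, j)] = min(1.0, d / radius)
    return loss
```

Varying `spacing` reproduces the knob used in the experiments: a tighter grid lowers every pairwise loss rate, a sparser grid raises them.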
6.4.2 FBcast without any rebroadcast
We draw a distinction between rebroadcast and retransmission that we will maintain throughout the rest of the discussion. Whenever a node transmits the same message that it has transmitted in the past, we refer to the event as a retransmission. When a node broadcasts a message received from another node, the event is called a rebroadcast. We first consider the case where a single source broadcasts a message and no rebroadcast follows this event; the source may, however, retransmit the message multiple times. Reliability is defined as the fraction of nodes that are able to receive (reconstruct) the original message. Thus, reliability depends on packet loss, which in turn depends on multiple factors, including the bit-error rate and interference at the receiving node. Since there is no rebroadcast, there is no interference. The experiments reported in Figures 40, 41, and 42 look at the effect of the bit-error rate on reliability. We observe that using FEC improves reliability compared to simply re-injecting the original packets multiple times, but without any extra mechanism, such as rebroadcasts, it does not provide enough reliability.

6.4.2.1 Simple radio model
Figure 40 shows that in the presence of bit errors, i.e., channel loss, reliability can be improved by using FEC and/or by increasing the stretch factor. This deployment corresponds to the scenario in which all motes are deployed very close together and a temporal channel loss exists because of weather conditions or external factors, so that all mote pairs experience similar transmission loss behavior. Figure 40-A compares the transmission of a message (originally 10 packets long) using two alternatives: using FEC to blow the message up to 20 packets and injecting all of them, versus re-injecting the original packets twice. Both alternatives add the same transmission overhead. The graph shows that if we add 10 redundancy packets to a message of original size 10 packets and inject 20 packets using FEC, out of which any 14 packets guarantee the reconstruction of the original data, we get higher reliability than naively injecting the 10 packets twice. We see that FEC offers better reliability at a similar transmission cost. For a stretch factor of 4 (i.e., 40 packets for FEC and re-injection), the advantage of FEC is more prominent (the right half of Figure 40).

Figure 40: The effect of forward error correction (FEC) on reliability for varying bit-error probabilities using a simple radio model. Figure A compares a stretch factor of 2.0 versus the source retransmitting the message twice. Figure B corresponds to similar experiments, but with a stretch factor of 4.0 and four retransmissions. The network consists of 121 motes deployed using the simple radio model.

6.4.2.2 Empirical radio model
Since the empirical radio model (see Section 6.4.1) is based on real-world characteristics, we use it to validate the benefits in a real-deployment scenario. The loss characteristic in the empirical radio model depends primarily on the network topology. First we consider a homogeneous deployment of 121 motes on a grid with a fixed inter-mote spacing of 5 feet; we then vary the spacing to verify the results. The central source broadcasts a 10-packet message. We then compare the effect of FEC against retransmission, i.e., comparing different stretch factors against equivalent numbers of retransmissions (both resulting in the same number of packets inserted into the network). As shown in Figure 41, increasing the stretch factor (number of packets inserted) helps improve reliability, but the improvement tapers off beyond a certain stretch factor. Since motes are deployed uniformly at 5-foot spacing, a larger number of motes implies a larger area. The distance from the central source mote to the edge of the network also increases, increasing the bit-error probability and decreasing the number of packets received successfully. Again, as observed earlier, using FEC is better than simply re-injecting the same packets multiple times.
Figure 41: Effect of forward error correction (FEC) on reliability using the empirical radio model.

For the experiments corresponding to Figure 42, we vary the inter-mote spacing while keeping the number of motes constant (121). At lower inter-mote spacings, reliability is higher than at larger spacings, and for a deployment of 121 motes, a spacing of 2 feet provides complete reliability using FEC with a stretch factor of four. If we look at the number of packets received at different motes, then because of the probabilistic nature of channel error, the resulting topological distribution pattern for successful reception is quite dynamic across different simulation runs. Topologies obtained from two typical runs are shown in Figure 43. There exist concentric bands of motes that receive similar numbers of packets, but the bands are neither circular nor identical across runs, because of the nature of the wireless medium and mutual interference. This means there is no simple way to divide a large area into smaller cells and put a broadcast server into each cell to provide reliability over a large area. Hence, we resort to another intuitive alternative to increase reliability: probabilistic rebroadcast at intermediate motes.
Figure 42: Effect of inter-mote spacing on reliability.

6.4.3 FBcast with probabilistic rebroadcast
When a node is allowed to rebroadcast new packets probabilistically, the results show that for the same deployment of 121 motes as before, reliable broadcast to all the motes is achievable even at inter-mote spacings higher than a mere 2 feet, by increasing the forwarding probability at intermediate motes. Both Pbcast (the probabilistic broadcast variant without FEC) and FBcast (the FEC variant) achieve complete reliability, but as shown in Figures 44 and 45, FBcast achieves higher reliability (Figures 44-A and 45-A) than Pbcast at lower transmission overhead (Figures 44-B and 45-B). Let us represent the forwarding probability p as α/n, where α is the number of forwarded packets and n is the original number of packets. At first glance, it may appear that for a given α, say p = 10/n, FBcast always gives higher reliability than Pbcast. But if we look closely, we find that there is no direct correlation between the reliability of Pbcast and FBcast for the same α. For example, with p = 10/n and a spacing of 10 feet, Pbcast has better reliability than FBcast, but at a spacing of 8 feet, FBcast gives better reliability. This can be explained by the transmission overheads of Pbcast and FBcast shown in Figures 44-B and 45-B. Contrary to intuition, we see that at p = 10/n and a spacing of 10 feet, Pbcast transmits
Figure 43: Topographical picture of reliability for two typical runs, showing how reliability decreases as we move away from the grid center. 121 motes placed on an 11x11 grid with inter-mote spacing of 5 feet.
Figure 44: Pbcast performance for 121 motes deployed on an 11x11 grid with varying inter-mote spacing (x-axis).

about 10 packets per mote, while FBcast transmits only about 4 packets per mote. We expected both Pbcast and FBcast to incur the same number of transmissions, i.e., about 10 packets per mote (p being 10/n). The number of transmissions explains why FBcast has lower reliability than Pbcast here. To understand why FBcast transmits less than Pbcast for the same α, we can look at the simple model shown in Figure 46. The three motes, placed in a straight line with one-hop spacing, incur only 11 retransmissions in the case of FBcast, compared to 17 extra transmissions (due to rebroadcast) in the case of Pbcast. Though we limit the number of extra transmissions through α, the number of new packets received at distant hops is not proportional to n, so FBcast performs fewer transmissions due to rebroadcasting. Also, we have observed that at
Figure 45: FBcast (with probabilistic rebroadcast) performance for 121 motes deployed on an 11x11 grid with varying inter-mote spacing (x-axis).
Pbcast scenario (17 retransmissions): the source transmits 10 packets; with packet loss probability 0.1, the first hop receives 9 packets and forwards 9 (p = 10/10); the second hop receives 8 packets and forwards 8.

FBcast scenario (11 retransmissions): the source transmits 40 packets; the first hop receives 36 packets and forwards 9 (p = 10/40); the second hop receives 8 packets and forwards 2.
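The arithmetic of this simple model can be reproduced with a short sketch (assuming each hop receives the floor of the expected packet count and forwards each new packet with probability p = α/n; the function name is ours):

```python
def chain_retransmissions(injected, alpha, loss, hops=2):
    # Retransmissions along a line of motes one hop apart: each hop
    # receives a (1 - loss) fraction of what the previous node sent and
    # forwards each new packet with probability p = alpha / injected.
    p = alpha / injected
    sent, total = injected, 0
    for _ in range(hops):
        received = int(sent * (1 - loss))  # floor of the expected count
        forwarded = int(received * p)
        total += forwarded
        sent = forwarded
    return total
```

Here chain_retransmissions(10, 10, 0.1) gives Pbcast's 17 retransmissions and chain_retransmissions(40, 10, 0.1) gives FBcast's 11, matching Figure 46.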
Figure 46: Transmission overhead comparison of Pbcast and FBcast for a simple topology.

higher inter-mote spacing, the number of transmissions for Pbcast decreases and the curve becomes similar to the FBcast case. From the above results, we learn that FBcast can provide higher reliability than Pbcast for a similar number of retransmissions, but this does not necessarily mean that FBcast provides higher reliability than Pbcast for the same α. Still, reliability is limited at higher inter-mote spacings. How far can reliability be stretched by increasing FBcast's stretch factor? To answer this, we look at the results shown in Figure 47. For a deployment of 121 motes with 10-foot inter-mote spacing, we can achieve close to 100% reliability at higher stretch factors (e.g., 6) and high forwarding probabilities, as shown in Figure 47. As expected, increasing the stretch factor improves reliability and also increases the number of retransmissions. Also, for the same stretch factor, increasing the forwarding probability improves reliability. However, for a stretch factor of 2 (n = 20), we
observe that reliability first improves, peaks at p = 16/20, and then drops rapidly. To understand this anomaly, we look at the effect of the broadcast rate at the data source. Consider two extreme cases: first, when the source packets are inserted into the network at a very slow rate, and second, when data packets are inserted with little delay between transmissions. At a slow rate of source broadcast, there is less interference, and thus higher reliability, than when data packets are inserted rapidly. The interference becomes more severe in the presence of probabilistic rebroadcast at the intermediate motes: when a large number of new packets is inserted at the source, the number of retransmissions at the other nodes increases, leading to collisions due to hidden-terminal problems and other interference issues. The effect of interference is also seen in our experimental results in Figure 48, where we compare the reliability of FBcast with n = 20 for 121 motes deployed at 10-foot inter-mote spacing. When the data source injects roughly one packet per second, reliability suffers heavily at high forwarding probabilities: though the number of retransmissions shown in Figure 48-B is very high, the number of successful receptions is low (see Figure 48-A). However, when the data source slows its packet broadcast rate (to roughly one packet every 2 seconds), reliability increases continuously until it reaches 100% at higher forwarding probabilities. The transmission overhead increases, but so does reliability, indicating that the interference effect is subdued because of the slower rate of source data broadcast.
The effect of interference is less apparent at higher stretch factors because of the basic property of FEC-based data recovery: even if some packets are lost to interference, motes can still reconstruct the original data.

6.4.3.1 Need for FBcast extension
From the above results, it is clear that FBcast can be adapted more flexibly to suit different deployment densities and reliability requirements. However, because of the number of parameters involved, the complexity of packet loss characteristics, and the probabilistic nature of FBcast, there is no simple expression that captures FBcast reliability for different parametric settings and network conditions. In the following discussion we explore how we can achieve high reliability for various network sizes without dynamically adapting the stretch factor or forwarding probability. In doing so we look at the limitation of FBcast in covering large deployments, which leads to our solution using repeater extensions.

Figure 47: Effect of stretch factor on reliability. Mote spacing = 10 feet. 121 motes deployed on an 11x11 grid. Forwarding probability is varied along the x-axis.

6.4.4 Protocol Extension with Repeaters
In the experiments of Section 6.4.3, all the motes are within the broadcast range of the source (referred to as single-hop experiments). The network topology used here is once again a grid of motes, but unlike the earlier single-hop experiments, the mote density is kept the same while increasing the number of motes, thus expanding the deployment area. For example, a grid of 441 motes deployed with an inter-mote spacing of s = 10' covers a 200' x 200' area. As the deployment area grows, the number of hops between the data source and the peripheral motes increases, realizing the effect of multi-hop communication. In the presence of such multi-hop communication, we want to measure the reliability of the pbcast and FBcast protocols. For the following experiments, pbcast is set with p = 0.8 because pbcast with lower forwarding probability values has very low reliability. The FBcast parameters are set to n = 40, p = 2/40. Figure 49 shows the pbcast and FBcast reliability results. Even though FBcast provides
Figure 48: The effect of injection rate: interference causes reliability to drop at higher forwarding probabilities, even though the number of retransmissions increases as expected.

higher reliability than pbcast, the reliability decreases with increasing deployment area. The fraction of motes able to reconstruct the original data decreases as the deployment area grows. There are two possible reasons for this result. First, the hidden-terminal problem is more severe here than in the single-hop experiments. For example, for a small deployment area, the source mote was able to inject all 10 packets, but for a larger deployment area, the source mote had to retry injecting the original packets several times. Second, the peripheral motes are able to receive only a few packets, or none. Because of channel loss and probabilistic retransmission, the number of received packets decreases as we move away from the data source at the center. This is observed in the single-hop scenario as well, but it is more evident in the multi-hop scenario, as shown in Figure 50. For these experiments, the inter-mote spacing is 10'. With 441 motes placed uniformly in a 200'x200' area, the figure shows the number of packets received in different zones. For pbcast with p = 8/10, the broadcast coverage is less than 5% of the area. As we increase n, we observe an increase in coverage. But increasing n also has an inherent cost (encoding/decoding), so a very high n may not be a desirable engineering choice. Also, even with n = 60, the coverage is less than 20%. Next, we explore how extending FBcast with repeaters extends the broadcast coverage.
[Figure 49 panels: (A) pbcast reliability vs. area; (B) FBcast reliability vs. area.]
Figure 49: Probabilistic broadcast (p = 0.8) results for the multi-hop scenario. The x-axis shows the total number of motes deployed uniformly with inter-mote spacings of s = 6' and 10'.

A repeater is a mote that reconstructs the original data and then acts as a data source, injecting the encoded data packets. For pbcast, being a repeater simply means retransmitting the original data packets; for FBcast, it means decoding the received packets to reconstruct the original data, encoding the data again to generate n packets, and re-injecting all of them. Hence, only a mote that has received at least k packets can be a repeater. We design and evaluate an FBcast protocol with repeater motes. For a fair comparison of pbcast and FBcast, we also develop a repeater variant of pbcast and compare it with FBcast.

6.4.4.1 FBcast Extension with Repeaters
Because the data source, the network topology, and the radio behavior are not known in advance, and because of the probabilistic nature of the broadcast, a priori placement of repeaters is not desirable. A repeater should be selected dynamically, and such a selection poses a question: how can a mote decide to be a repeater based on the received packets? If the condition for being a repeater is very relaxed, there may be too many motes serving as repeaters (over-coverage); if the condition is very tight, there will be too few repeaters to cover the whole area (under-coverage). Our repeater algorithm strikes a balance by utilizing the rate of packet receptions and the number of received packets. Figure 52 shows the pseudo code for FBcast with the repeater algorithm. Every mote calculates how long it should listen before deciding whether to become a repeater; call this time the listen window. At the end of its listen window, if a mote has enough packets to reconstruct the original data, but fewer than a threshold number of packets (kth), the mote becomes a repeater. The threshold check ensures that not every mote becomes a repeater, but only those that receive a small fraction of the injected packets. This threshold condition is satisfied by a band of motes, as is clear from the one-hop results: around the data source, there exist concentric bands of motes with similar numbers of received packets. To ensure that not all motes in a particular band become repeaters, we carefully randomize the length of the initial listen window. The listen window consists of two parts: the first is the expected time by which a mote will receive the threshold number (kth) of packets, and the second is a random duration t ∈ [0, tk], where tk is the duration between the reception of the first and the kth packet. The length of the listen window affects the latency of packet dissemination over a large area. For faster propagation, the listen window can be reduced, though the flip side of this choice is an increase in the number of repeaters per unit area. Figure 51 shows the results of FBcast deployed over a coverage area of 200'x200' with motes spaced every 10' along the grid points. Reliability is plotted on the y-axis against the forwarding probability on the x-axis. We present results for two different spacings, s = 6' and s = 10'. We see that in most cases the attained reliability is very high, except for one instance (because of the probabilistic nature of the protocol). The algorithm parameters are set as follows: n = 40, p = 8/40, and kth = 21. The value of the threshold packet count kth is important: since a mote becomes a repeater only if it receives between k and kth packets, setting kth too close to k decreases the probability of a mote becoming a repeater.
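The repeater decision just described can be sketched as follows. The extrapolation of the expected time to kth packets from the observed arrival rate is our assumption (the pseudo code of Figure 52 may differ), as are all names:

```python
import random

def repeater_decision(arrival_times, k, k_th, rng=random):
    """Listen-window rule: become a repeater iff, at the end of the
    window, the mote can decode (>= k packets) but sits on the fringe
    of good reception (< k_th packets). arrival_times must be sorted."""
    if len(arrival_times) < k:
        return False                      # cannot reconstruct the data
    t_first = arrival_times[0]
    t_k = arrival_times[k - 1] - t_first  # time taken to collect k packets
    # Listen window: expected time to see k_th packets, extrapolated
    # from the observed rate, plus a random slack in [0, t_k] that
    # de-synchronizes motes sitting in the same reception band.
    window_end = t_first + (k_th / k) * t_k + rng.uniform(0, t_k)
    received = sum(1 for t in arrival_times if t <= window_end)
    return k <= received < k_th
```

A mote flooded with packets fails the upper-bound check and stays quiet, while a mote that barely decodes volunteers, which is what keeps repeaters on the reception fringe.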
In Figure 53A we show the topographical coverage attained by FBcast with repeaters; for the chosen setting, complete coverage of the 200'x200' area is attained. Most of the motes receive more than 21 packets. We found that setting kth = 1.5k for mote deployments with 10' spacing or less enables this complete coverage. Figure 53B shows the positions of the repeater motes.
6.4.4.2 pbcast Extension with Repeaters
For pbcast, there is typically no way for a mote to know how far it is from the data source (unless some extra information, such as neighborhood location, is provided). Thus, in this variant we assign a predefined probability (rp) of being a repeater to any mote that has received all 10 packets. Figure 54 shows that pbcast with repeaters can provide complete coverage of the deployment area, albeit at the cost of a high repeater probability. For rp = 0.6, the reliability is 95% for an inter-mote spacing of s = 6', but it goes below 10% for a spacing of s = 10' (sparser mote density). Increasing the repeater probability helps achieve high reliability for sparser deployments, but it also increases interference; high values of rp essentially amount to data flooding, and the consequent interference leads to the well-known broadcast storm situation [71]. At an inter-mote spacing of 10', we achieve close to complete coverage only for very dense repeater deployments (rp > 0.8).

6.4.4.3 Protocol Comparisons
The repeater variants of pbcast and FBcast both have higher reliability than their original counterparts. For sparser deployments, pbcast yields high reliability only with very high repeater probabilities, causing high transmission overhead. FBcast, with the aid of the listen window and the threshold number of received packets, is able to control the number of repeaters while ensuring more than 99% reliability for various deployment parameters.
6.5 Discussion and Limitations
We want to bring a few facts about FBcast to the reader's attention. (1) The reliability guarantee is only probabilistic, whereas some applications may demand absolute reliability. However, none of the baseline protocols, such as flooding or pbcast, can provide absolute reliability either. In that case the base protocol needs to be augmented with an ACK/NACK-based scheme, wherein any node that learns of new information being disseminated, yet is unable to reconstruct it, can proactively contact other nodes and obtain the data eventually. We establish in this chapter that as a baseline protocol
FBcast performs better than its counterparts. (2) In many applications, information needs to be disseminated only to a specific region of nodes; in such a scenario, the repeater probability can be adjusted accordingly. Several other application-specific demands may not be met by FBcast in its vanilla form, but we believe the algorithm can be adjusted for other situations. (3) The source node in FBcast seems to suffer a higher transmission cost, since it has to broadcast encoded information. However, this one-time transmission is not expected to drain the source's power. Over the lifetime of one deployment, we expect different nodes to act as the source at different times, so the overall messaging overhead should be amortized; FBcast should thus result in overall messaging savings. (4) Data confidentiality is present in FBcast in a limited form. An eavesdropper will not be able to understand the messages; however, the scheme is vulnerable to node compromise, and handling such issues is beyond the scope of FBcast. (5) The evaluation of FBcast in this chapter is carried out on a grid topology, which remains merely a starting point. Although a grid may not always be representative of a real-life scenario, nothing inherent in FBcast depends on the deployment topology, and there is every reason to believe that it will perform similarly on a random topology. The ability of a node to decide intelligently whether to assume a repeater's role should give FBcast a further edge over pbcast.
6.6
Summary
We have presented a new broadcast protocol that exploits a data-encoding technique to achieve higher reliability and data confidentiality at low overhead. The simulation experiments show that with increased network density, traditional broadcast becomes quite unreliable, whereas FBcast maintains its reliability. The forwarding probability parameter of FBcast can be tuned to decrease the number of transmissions at higher density. The results also show that as we move away from the data source, the number of received new packets decreases. We have therefore proposed a repeater algorithm that reconstructs the original data and then re-injects the new packets into the network (when the number of received packets falls
below a threshold). FBcast trades off computation for communication. The data encoding (at the source) and decoding (at the recipients) consume computation cycles, but since computation is orders of magnitude less power-expensive than communication, we expect a net power saving. Given the computation needs of the encoding scheme, FBcast is best suited for computationally rich nodes. Based upon the continuing technology trend, we believe that today's handhelds are tomorrow's motes, and FBcast will be quite suitable for the FWSN. Since FBcast exposes various parameters that can be tuned to achieve high reliability under different network conditions, it serves as an important tool in SensorStack for achieving adaptability. With application data broadcast as its focus, FBcast complements the pbcast-based control data broadcast (see Chapter 5) needs of SensorStack.
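The trade-off above can be made concrete with a toy simulation. The following sketch (all parameter names, the grid size, and the default values are illustrative, not the thesis's experimental setup) models FBcast's core idea: a source broadcasts n coded packets derived from k original ones, every node re-forwards each newly heard packet with probability p, and a node counts as reliable once it holds any k distinct coded packets.

```python
import random

def simulate_fbcast(width=10, height=10, n=40, k=10, p=2/40, radio=1.5, seed=1):
    """Toy FBcast on a grid: source broadcasts n coded packets; every other
    node re-forwards each packet it hears for the first time with probability
    p. A node can reconstruct once it holds any k distinct coded packets."""
    rng = random.Random(seed)
    nodes = [(x, y) for x in range(width) for y in range(height)]
    heard = {v: set() for v in nodes}           # coded packet ids held per node
    heard[(0, 0)] = set(range(n))               # source sits at one corner
    queue = [((0, 0), i) for i in range(n)]     # pending (sender, packet) broadcasts
    while queue:
        (sx, sy), pkt = queue.pop(0)
        for v in nodes:
            if v == (sx, sy):
                continue
            # within radio range and packet not yet seen at v
            if (v[0] - sx) ** 2 + (v[1] - sy) ** 2 <= radio ** 2 and pkt not in heard[v]:
                heard[v].add(pkt)
                if rng.random() < p:            # probabilistic re-forwarding
                    queue.append((v, pkt))
    decoded = sum(1 for v in nodes if len(heard[v]) >= k)
    return decoded / len(nodes)                 # fraction of nodes that can decode
```

With p = 1 the protocol degenerates to flooding and every connected node decodes; lowering p trades reliability for fewer transmissions, which is the tuning knob discussed above.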
[Figure: four panels (n = 10, 20, 40, and 60) mapping the count of successfully received packets (0–42, in bands of 7) over a 21×21 grid of motes, rows S1–S21, columns 1–21.]
Figure 50: Topography of successful reception for motes deployed with 10’ spacing in a 200’x200’ area.
[Figure: reliability (0–1) versus FBcast forwarding probability (p = 1/40, 2/40, 4/40), for inter-mote spacings s = 6’ and s = 10’.]
Figure 51: Performance of FBcast (n = 40, p = 2/40) with repeaters for 441 motes deployed with inter-mote density s = 6’ and 10’.
Input: k, kth, packet arrival times
Algorithm:
  t1 = time when 1st packet received
  tk = time when kth packet received
  At tk:
    1. compute tkth = (tk − t1) ∗ kth / (k − 1)
    2. compute trand = random value in {0, tk}
    3. compute listenWindow = tkth + trand
  Wait for listenWindow duration;
  If (total packets received > kth)
    return;                        // no need to be a repeater
  Else if (total packets received > k)  // be a repeater
    1. reconstruct original data
    2. encode data to get all n packets
    3. transmit the new packets
Figure 52: Pseudocode for FBcast with repeater.
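For concreteness, the listen-window logic of Figure 52 can be transcribed directly into Python. This is an illustrative sketch (the function and variable names are ours); the one deliberate change is that the decode test uses >= k, since k packets suffice for reconstruction, where the figure writes > k.

```python
import random

def listen_window(arrival_times, k, k_th):
    """Compute the repeater listen window from Figure 52.
    arrival_times: receive times of the first k coded packets."""
    t1, tk = arrival_times[0], arrival_times[k - 1]
    t_kth = (tk - t1) * k_th / (k - 1)   # projected time to accumulate k_th packets
    t_rand = random.uniform(0, tk)       # random backoff desynchronizes repeaters
    return t_kth + t_rand

def should_repeat(total_received, k, k_th):
    """Decide, once the listen window expires, whether to act as a repeater."""
    if total_received > k_th:
        return False                     # neighborhood is well covered; stay quiet
    return total_received >= k           # can reconstruct: re-encode and transmit
```

A node that returns True then reconstructs the data, re-encodes all n packets, and transmits them, as in the last three steps of the figure.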
[Figure: panel A maps the count of successfully received packets (0–42, in bands of 7) over the 21×21 grid; panel B marks the positions of the repeater motes on the grid.]
A. Topography for multi-hop scenario with repeaters (200'x200' area with grids of size 10'x10')
B. Position of the repeater motes
Figure 53: FBcast with repeaters for motes deployed with 10’ spacing in a 200’x200’ area: (A) shows the coverage, and (B) shows the overhead in terms of number of repeaters and their positions.
[Figure: reliability (0–1) versus repeater probability rp (0.2–0.8), for inter-mote spacings s = 6’ and s = 10’.]
Figure 54: Performance of pbcast (p = 0.8) with repeaters for 441 motes deployed with inter-mote density s = 6’ and 10’.
CHAPTER 7
CONCLUSIONS AND FUTURE WORK
In this thesis, we investigate research issues in the design of an adaptable protocol stack for future sensor networks. The centerpiece of this dissertation's contribution is SensorStack, a software architecture for a novel protocol stack aimed at FWSN. Guided by the philosophy of “local decisions leading to global optimizations”, the thesis presents three novel services for the efficient support of adaptability in SensorStack. First, SensorStack employs a distributed role assignment algorithm for mapping an application task graph onto a physical network. Using a “cost function approach” to capture application requirements, we show that the system can adapt to changes in network conditions in an application-specific manner. Since no single cost function is suited for all scenarios, we present a set of candidate cost functions and analyze their behavior. To evaluate a cost function, we often need to access data available at other layers. Also, because of the localized nature of the role assignment decisions, we observe that the role assignment algorithm's overhead could be reduced if information about neighboring nodes were shared among the routing, MAC, and fusion layers. To support node-level adaptability efficiently, we provide an information exchange service (IES) in SensorStack that facilitates cross-layering across modules of the protocol stack. IES allows us to separate the data-sharing needs of protocol modules from their core functionalities. Using an efficient memory management design, IES reduces the latency for accessing information needed for decision making in the different layers of the protocol stack, compared to using a vanilla routing (HSN) module for information gathering. IES also helps to reduce the communication overhead incurred in the periodic evaluation of the fusion layer's cost functions. To facilitate cross-layering across nodes, we investigate techniques to disseminate information with minimal communication overhead.
While IES on a single node allows a module
to take advantage of data being collected by other modules, IES across nodes allows a node to take advantage of data being collected by other nodes. Using the intuition that the usefulness of IES data is localized in nature, we design an information dissemination service (IDS) as part of SensorStack. Experiments show that using probabilistic broadcast to share IES data, and using query-scope to limit the broadcast, result in low communication overhead. Finally, the thesis investigates probabilistic broadcast using error-correcting codes and repeaters to broadcast large volumes of data over large regions. Simulation experiments show that probabilistic broadcast is very helpful in realizing high reliability at low communication overhead; further, repeaters help in increasing the broadcast range.
7.1
Directions for Future Research
This thesis has proposed services for adaptability that are layer-independent. Our evaluations, however, have focused only on the interactions between the fusion and routing layers. Further investigation is needed to understand the effect of different MAC protocols on role assignment and on IDS behavior. Also, we have used homogeneous and static deployments for our experiments, and have not studied how node mobility affects SensorStack's adaptability. Along these lines, there are two main avenues for future research: design and evaluation of new cost functions that take mobility into account, and study of MAC-layer effects on overall SensorStack adaptability. The fusion layer currently takes an application task graph as an input. However, further investigation is needed to understand how best to capture FWSN application logic as a task graph. For example, the role assignment algorithm assumes knowledge of the data sources for deploying a task graph on the sensor network. This assumption is limiting for data-centric queries, since such knowledge is not available a priori. To handle this additional source of dynamism in task graph deployment, one possible approach is to have an interest-dissemination phase before the initialization phase of the role assignment algorithm. During this phase, the interest set of individual nodes (for specific data) is disseminated as is done in directed diffusion [30]. When the response packets from the sources reach the sink (root node of the application task graph), the source addresses are extracted and recorded for
later use in other phases of the role assignment algorithm. Clearly, there is an opportunity for further investigation along these lines. The information dissemination service in SensorStack uses the priority of the information to adapt the forwarding probability of IES data packets. How best to prioritize information depends on many factors, including application requirements and dynamic network conditions. Information prioritization is a non-trivial problem because of the variety of information gathered at the nodes and the distributed, dynamic nature of the network. Also, though we investigate supporting prioritized dissemination of IES data at the routing layer, further research is needed to support data priority at the transport layer. Designing a comprehensive information dissemination service for a heterogeneous sensor network is an interesting avenue for future research.
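As a simple illustration of this design space (the linear mapping below is our own assumption, not a mechanism from the thesis), a priority-aware IDS could scale the forwarding probability between a floor and a ceiling:

```python
def forwarding_probability(priority, base_p=0.2, max_p=0.9):
    """Map an information priority in [0, 1] to a forwarding probability,
    so higher-priority IES data is rebroadcast more aggressively.
    base_p and max_p are illustrative tuning knobs, not thesis parameters."""
    priority = min(max(priority, 0.0), 1.0)   # clamp out-of-range priorities
    return base_p + (max_p - base_p) * priority
```

Even this trivial policy exposes the open questions named above: who assigns the priority, and how should base_p and max_p track dynamic network conditions.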
REFERENCES
[1] “http://www.palmzone.net/article.php?sid=571, accessed July 2006.” [2] Adhikari, S., Paul, A., and Ramachandran, U., “D-stampede: Distributed programming system for ubiquitous computing,” in Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS), (Vienna), July 2002. [3] Ahmad, Y. and Cetintemel, U., “Network-aware query processing for stream-based applications,” in Proc. of the Int’l Conf. on Very Large Databases (VLDB), 2004. [4] Bhardwaj, M. and Chandrakasan, A., “Bounding the lifetime of sensor networks via optimal role assignments,” in IEEE INFOCOM, 2002. [5] Bhattacharya, S., Kim, H., Prabh, S., and Abdelzaher, T., “Energy-conserving data placement and asynchronous multicast in wireless sensor networks,” in MobiSys ’03: Proceedings of the 1st international conference on Mobile systems, applications and services, (New York, NY, USA), pp. 173–185, ACM Press, 2003. [6] Boulis, A., Han, C. C., and Srivastava, M. B., “Design and implementation of a framework for programmable and efficient sensor networks,” in The First International Conference on Mobile Systems, Applications, and Services (MobiSys), (San Francisco, CA), 2003. [7] Byers, J. W., Luby, M., Mitzenmacher, M., and Rege, A., “A digital fountain approach to reliable distribution of bulk data,” in Proceedings of ACM SIGCOMM, pp. 56–67, 1998. [8] Cayirci, E., Su, W., and Sankarasubramaniam, Y., “Wireless sensor networks: A survey,” Computer Networks (Elsevier), vol. 38, pp. 393–422, March 2002. [9] Cerpa, A. and Estrin, D., “Ascent: Adaptive self-configuring sensor networks topologies,” in Proceedings of Infocom, 2002. [10] Chandrasekaran, S., Cooper, O., Deshpande, A., Franklin, M. J., Hellerstein, J. M., Hong, W., Krishnamurthy, S., Madden, S. R., Raman, V., Reiss, F., and Shah, M. A., “TelegraphCQ: Continuous dataflow processing for an uncertain world,” in Proc. of the First Biennial Conference on Innovative Data Systems Research (CIDR), 2003.
[11] Chen, B., Jamieson, K., Balakrishnan, H., and Morris, R., “Span: An energy-efficient coordination algorithm for topology maintenance in ad hoc wireless networks,” in Mobile Computing and Networking, pp. 85–96, 2001. [12] Chen, J., DeWitt, D. J., Tian, F., and Wang, Y., “NiagaraCQ: A scalable continuous query system for internet databases,” in Proc. of ACM SIGMOD Conference on Management of Data, 2000.
[13] Cherniack, M., Balakrishnan, H., Carney, D., Cetintemel, U., Xing, Y., and Zdonik, S., “Scalable distributed stream processing,” in Proc. Conf. for Innovative Database Research (CIDR), 2003. [14] Conti, M., Maselli, G., and Turi, G., “Design of a flexible cross-layer interface for ad hoc networks,” in Fourth Annual Mediterranean Ad Hoc Networking Workshop, June 2005. [15] Conti, M., Maselli, G., Turi, G., and Giordano, S., “Cross-layering in mobile ad hoc network design,” IEEE Computer, vol. 37, pp. 48–51, Feb 2004. [16] Culler, D., Dutta, P., Ee, C. T., Fonseca, R., Hui, J., Levis, P., Polastre, J., Shenker, S., Stoica, I., Tolle, G., and Zhao, J., “Towards a sensor network architecture: Lowering the waistline,” in The Tenth Workshop on Hot Topics in Operating Systems (HotOS X), June 2005. [17] Eugster, P. T., Guerraoui, R., Handurukande, S. B., Kouznetsov, P., and Kermarrec, A.-M., “Lightweight probabilistic broadcast,” ACM Trans. Comput. Syst., vol. 21, no. 4, pp. 341–374, 2003. [18] Frank, C. and Romer, K., “Algorithms for generic role assignment in wireless sensor networks,” in SenSys ’05: Proceedings of the 3rd international conference on Embedded networked sensor systems, (New York, NY, USA), pp. 230–242, ACM Press, 2005. [19] Ganesan, D., Krishnamachari, B., Woo, A., Culler, D., Estrin, D., and Wicker, S., “An empirical study of epidemic algorithms in large scale multihop wireless networks,” 2002. Technical Report, Intel Research. [20] Garey, M. R. and Johnson, D. S., Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco: W. H. Freeman, 1979. [21] Gay, D., Levis, P., von Behren, R., Welsh, M., Brewer, E., and Culler, D., “The nesC language: A holistic approach to networked embedded systems,” in PLDI ’03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, (New York, NY, USA), pp. 1–11, ACM Press, 2003. [22] Gerkey, B. P. and Mataric, M.
J., “Murdoch: Publish/subscribe task allocation for heterogeneous agents,” in Proceedings of the Fourth International Conference on Autonomous Agents (Sierra, C., Gini, M., and Rosenschein, J. S., eds.), (Barcelona, Catalonia, Spain), pp. 203–204, ACM Press, 2000. [23] Gu, X. and Nahrstedt, K., “On Composing Stream Applications in Peer-to-Peer Environments,” IEEE Transactions on Parallel and Distributed Systems (TPDS), 2005. [24] He, T., Stankovic, J. A., Lu, C., and Abdelzaher, T., “SPEED: A Stateless Protocol for Real-Time Communication,” in Proceedings of ICDCS 2003, 2003. [25] Heidemann, J., Silva, F., and Estrin, D., “Matching data dissemination algorithms to application requirements,” in Proceedings of the first international conference on Embedded networked sensor systems, pp. 218–229, ACM Press, 2003.
[26] Heinzelman, W., Chandrakasan, A., and Balakrishnan, H., “Energy-efficient Communication Protocols for Wireless Microsensor Networks,” in International Conference on System Sciences, (Maui, HI), January 2000. [27] Hill, J., Szewczyk, R., Woo, A., Hollar, S., Culler, D. E., and Pister, K. S. J., “System architecture directions for networked sensors,” in Architectural Support for Programming Languages and Operating Systems, pp. 93–104, 2000. [28] Hui, J. W. and Culler, D., “The dynamic behavior of a data dissemination protocol for network programming at scale,” in Proceedings of the 2nd international conference on Embedded networked sensor systems, pp. 81–94, 2004. [29] “i-Mote2, http://embedded.seattle.intel-research.net/wiki/index.php?title=Intel Mote 2, accessed July 2006.” [30] Intanagonwiwat, C., Govindan, R., and Estrin, D., “Directed diffusion: a scalable and robust communication paradigm for sensor networks,” in Mobile Computing and Networking, pp. 56–67, 2000. [31] Intel Research, Berkeley, “Heterogeneous Sensor Network: http://www.intel.com/research/exploratory/heterogeneous.htm; Software available in TinyOS 1.1.10 snapshot from Sourceforge.net.” [32] Jae-Hwan Chang and Leandros Tassiulas, “Energy conserving routing in wireless ad-hoc networks,” in IEEE INFOCOM, pp. 22–31, 2000. [33] Levis, P., Patel, N., Culler, D., and Shenker, S., “Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks,” in Proceedings of the First ACM/Usenix Symposium on Networked Systems Design and Implementation (NSDI), 2004. [34] Karp, B. and Kung, H. T., “GPSR: greedy perimeter stateless routing for wireless networks,” in Mobile Computing and Networking, pp. 243–254, 2000. [35] Kawadia, V. and Kumar, P. R., “A cautionary perspective on cross layer design,” IEEE Wireless Communication Magazine, vol. 2, pp. 3–11, Feb 2005.
[36] Kumar, R., Wolenetz, M., Agarwalla, B., Shin, J., Hutto, P., Paul, A., and Ramachandran, U., “DFuse: a framework for distributed data fusion,” in SenSys ’03: Proceedings of the 1st international conference on Embedded networked sensor systems, (New York, NY, USA), pp. 114–125, ACM Press, 2003. [37] Levis, P., Lee, N., Welsh, M., and Culler, D., “TOSSIM: accurate and scalable simulation of entire TinyOS applications,” in Proceedings of the first international conference on Embedded networked sensor systems, pp. 126–137, ACM Press, 2003. [38] Levis, P., Madden, S., Gay, D., Polastre, J., Szewczyk, R., Woo, A., Brewer, E., and Culler, D., “The emergence of networking abstractions and techniques in TinyOS,” in Proceedings of the First ACM/Usenix Symposium on Networked Systems Design and Implementation (NSDI), 2004. [39] Li, L., Halpern, J., and Haas, Z., “Gossip-based ad hoc routing,” in Proceedings of the 21st Conference of the IEEE Communications Society (INFOCOM’02), 2002.
[40] Lim, H. and Kim, C., “Multicast tree construction and flooding in wireless ad hoc networks,” in Proc. ACM Workshop on Modelling, Analysis and Simulation of Wireless and Mobile Systems, 2000. [41] Lim, H. and Kim, C., “Flooding in wireless networks,” Computer Communications, vol. 24, no. 3-4, pp. 353–363, 2001. [42] Liu, H., Roeder, T., Walsh, K., Barr, R., and Sirer, E. G., “Design and implementation of a single system image operating system for ad hoc networks,” in MobiSys ’05: Proceedings of the 3rd international conference on Mobile systems, applications, and services, (New York, NY, USA), pp. 149–162, ACM Press, 2005. [43] Lou, W. and Wu, J., “On reducing broadcast redundancy in ad hoc wireless networks,” IEEE Transactions on Mobile Computing, vol. 1, April-June 2002. [44] Luby, M., “LT codes,” in Proceedings of 43rd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2002. [45] Luby, M. G., Mitzenmacher, M., Shokrollahi, M. A., and Spielman, D. A., “Efficient erasure correcting codes,” IEEE Transactions on Information Theory, vol. 47, no. 2, pp. 569–584, 2001. [46] Luo, H. and Pottie, G. J., “Routing explicit side information for data compression in wireless sensor networks,” in DCOSS, pp. 75–88, 2005. [47] Madden, S. R., Franklin, M. J., Hellerstein, J. M., and Hong, W., “TAG: a tiny aggregation service for ad-hoc sensor networks,” in Operating System Design and Implementation (OSDI), (Boston, MA), Dec 2002. [48] Mathur, A., Hall, R., Jahanian, F., Prakash, A., and Rasmussen, C., “The Publish/Subscribe Paradigm for Scalable Group Collaboration Systems,” Technical Report CSE-TR-270-95, University of Michigan, EECS Department, 1995. [49] Maymounkov, P. and Mazieres, D., “Rateless codes and big downloads,” in Proc. of the 2nd International Workshop on Peer-to-Peer Systems, 2003.
[50] “MICA2: http://www.xbow.com/Products/productsdetails.aspx?sid=72.” [51] Motwani, R., Widom, J., Arasu, A., Babcock, B., Babu, S., Datar, M., Manku, G., Olston, C., Rosenstein, J., and Varma, R., “Query processing, resource management, and approximation in a data stream management system,” in Proc. of the First Biennial Conference on Innovative Data Systems Research (CIDR), 2003. [52] Obraczka, K., Viswanath, K., and Tsudik, G., “Flooding for reliable multicast in multi-hop ad hoc networks,” Wireless Networks, vol. 7, no. 6, pp. 627–634, 2001. [53] Orecchia, L., Panconesi, A., Petrioli, C., and Vitaletti, A., “Localized techniques for broadcasting in wireless sensor networks,” in Proceedings of the 2004 joint workshop on Foundations of mobile computing, pp. 41–51, ACM Press, 2004.
[54] “ORiNOCO PC Card (SilverGold) Specification: http://www.hyperlinktech.com/web/orinoco/orino, 2003.”
[55] Palchaudhuri, S., An Adaptive Sensor Network Architecture for Multi-scale Communication. PhD thesis, Computer Science, Rice University, 2006. [56] PalChaudhuri, S., Kumar, R., Baraniuk, R. G., and Johnson, D. B., “Design of adaptive overlays for multi-scale communication in sensor networks.,” in DCOSS, pp. 173–190, 2005. [57] Papadimitriou, C. and Yannakakis, M., “Towards an architecture-independent analysis of parallel algorithms,” in STOC ’88: Proceedings of the twentieth annual ACM symposium on Theory of computing, (New York, NY, USA), pp. 510–513, ACM Press, 1988. [58] Park, S.-J., Vedantham, R., Sivakumar, R., and Akyildiz, I. F., “A scalable approach for reliable downstream data delivery in wireless sensor networks,” in MobiHoc ’04: Proceedings of the 5th ACM international symposium on Mobile ad hoc networking and computing, (New York, NY, USA), pp. 78–89, ACM Press, 2004. [59] Peng, W. and Lu, X.-C., “On the reduction of broadcast redundancy in mobile ad hoc networks,” in Proceedings of the 1st ACM international symposium on Mobile ad hoc networking & computing, pp. 129–130, IEEE Press, 2000. [60] Pietzuch, P., Ledlie, J., Shneidman, J., Welsh, M., Seltzer, M., and Roussopoulos, M., “Network-aware operator placement for stream-processing systems,” in Proc. of the Int’l Conf. on Data Engineering (ICDE), 2006. [61] Polastre, J., Hui, J., Levis, P., Zhao, J., Culler, D., Shenker, S., and Stoica, I., “A unifying link abstraction for wireless sensor networks,” in SenSys ’05: Proceedings of the 3rd international conference on Embedded networked sensor systems, (New York, NY, USA), pp. 76–89, ACM Press, 2005. [62] Ramachandran, U., Nikhil, R. S., Rehg, J. M., Angelov, Y., Paul, A., Adhikari, S., Mackenzie, K. M., Harel, N., and Knobe, K., “Stampede: A cluster programming middleware for interactive stream-oriented applications,” IEEE Transactions on Parallel and Distributed Systems, vol. 14, no. 11, pp. 1140–1154, 2003. 
[63] Ramakrishnan, K., Floyd, S., and Black, D., “The addition of explicit congestion notification (ECN) to IP,” RFC 3168, IETF, Sep. 2001. [64] Rehg, J. M., Loughlin, M., and Waters, K., “Vision for a smart kiosk,” in Computer Vision and Pattern Recognition, June 1997. [65] Robins, G. and Zelikovsky, A., “Improved Steiner tree approximation in graphs,” in SODA ’00: Proceedings of the eleventh annual ACM-SIAM symposium on Discrete algorithms, (Philadelphia, PA, USA), pp. 770–779, Society for Industrial and Applied Mathematics, 2000. [66] Seshadri, S., Kumar, V., and Cooper, B. F., “Optimizing Multiple Queries in Distributed Data Stream Systems,” in 2nd IEEE International Workshop on Networking Meets Database (NetDB), in conjunction with ICDE, 2006. [67] Singh, S., Woo, M., and Raghavendra, C. S., “Power-aware routing in mobile ad hoc networks,” in Mobile Computing and Networking, pp. 181–190, 1998.
[68] Srivastava, U., Munagala, K., and Widom, J., “Operator placement for in-network stream query processing,” in Proc. of the ACM Symposium on Principles of Database Systems (PODS), 2005. [69] Stojmenovic, I. and Wu, J., “Broadcasting and activity-scheduling in ad hoc networks,” 2004. [70] Tewari, R., Dahlin, M., Vin, H. M., and Kay, J. S., “Design considerations for distributed caching on the internet,” in ICDCS, pp. 273–284, 1999. [71] Tseng, Y.-C., Ni, S.-Y., Chen, Y.-S., and Sheu, J.-P., “The broadcast storm problem in a mobile ad hoc network,” Wirel. Netw., vol. 8, no. 2/3, pp. 153–167, 2002. [72] van Dam, T. and Langendoen, K., “An adaptive energy-efficient MAC protocol for wireless sensor networks,” in SenSys ’03: Proceedings of the 1st international conference on Embedded networked sensor systems, (New York, NY, USA), pp. 171–180, ACM Press, 2003. [73] Van Renesse, R., Birman, K. P., and Vogels, W., “Astrolabe: A robust and scalable technology for distributed system monitoring, management, and data mining,” ACM Trans. Comput. Syst., vol. 21, no. 2, pp. 164–206, 2003. [74] Viola, P. and Jones, M., “Rapid object detection using a boosted cascade of simple features,” in Proc. CVPR, pp. 511–518, 2001. [75] Heinzelman, W. R., Kulik, J., and Balakrishnan, H., “Adaptive protocols for information dissemination in wireless sensor networks,” in Proceedings of the fifth annual ACM/IEEE international conference on Mobile computing and networking, (Seattle, WA, USA), pp. 174–185, 1999. [76] Wan, C.-Y., Campbell, A. T., and Krishnamurthy, L., “PSFQ: a reliable transport protocol for wireless sensor networks,” in WSNA ’02: Proceedings of the 1st ACM international workshop on Wireless sensor networks and applications, (New York, NY, USA), pp. 1–11, ACM Press, 2002. [77] Weatherspoon, H. and Kubiatowicz, J., “Erasure coding vs. replication: A quantitative comparison,” in Peer-to-Peer Systems: First International Workshop (IPTPS), 2002. [78] Welsh, M.
and Mainland, G., “Programming sensor networks using abstract regions,” in First USENIX/ACM Symposium on Networked Systems Design and Implementation (NSDI ’04), March 2004. [79] Williams, B. and Camp, T., “Comparison of broadcasting techniques for mobile ad hoc networks,” in Proceedings of the 3rd ACM international symposium on Mobile ad hoc networking & computing, pp. 194–205, ACM Press, 2002. [80] Wolenetz, M., Characterizing Middleware Mechanisms for Future Sensor Networks. PhD thesis, College of Computing, Georgia Institute of Technology, August 2005. [81] Wolenetz, M., Kumar, R., Shin, J., and Ramachandran, U., “Middleware guidelines for future sensor networks,” in Proceedings of the First Workshop on Broadband Advanced Sensor Networks, 2004.
[82] Wolenetz, M., Kumar, R., Shin, J., and Ramachandran, U., “A simulation-based study of wireless sensor network middleware,” International Journal of Network Management, special issue on sensor networks, vol. 15, July 2005. [83] Wolski, R., Spring, N. T., and Hayes, J., “The network weather service: a distributed resource performance forecasting service for metacomputing,” Future Generation Computer Systems, vol. 15, no. 5–6, pp. 757–768, 1999. [84] Yalagandula, P. and Dahlin, M., “A scalable distributed information management system,” in Proceedings of the ACM SIGCOMM ’04 Conference, (Portland, Oregon), August 2004. [85] Yao, Y. and Gehrke, J., “The Cougar approach to in-network query processing in sensor networks,” in Proceedings of SIGMOD, 2002. [86] Ye, W., Heidemann, J., and Estrin, D., “An Energy-Efficient MAC protocol for Wireless Sensor Networks,” in Proceedings of INFOCOM 2002, (New York), June 2002. [87] Yokota, H., Idoue, A., Hasegawa, T., and Kato, T., “Link layer assisted mobile IP fast handoff method over wireless LAN networks,” in Proceedings of the 8th annual international conference on Mobile computing and networking (MobiCom ’02), (New York, NY, USA), pp. 131–139, ACM Press, 2002. [88] Younis, M., Youssef, M., and Arisha, K., “Energy-aware routing in cluster-based sensor networks,” in MASCOTS ’02: Proceedings of the 10th IEEE International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunications Systems (MASCOTS’02), (Washington, DC, USA), p. 129, IEEE Computer Society, 2002. [89] Younis, O. and Fahmy, S., “Distributed clustering in ad-hoc sensor networks: A hybrid,” 2004. [90] Zayas, E., “Attacking the process migration bottleneck,” in Proceedings of the eleventh ACM Symposium on Operating systems principles, pp. 13–24, ACM Press, 1987. [91] Zhang, J., Zhu, M., Papadias, D., Tao, Y., and Lee, D. L., “Location-based spatial queries,” in SIGMOD Conference, pp. 443–454, 2003.