Supporting Concurrent Applications in Wireless Sensor Networks
Yang Yu, Loren J. Rittle
Pervasive Platforms and Architectures Lab, Applications Research Center, Motorola Labs
{yang, ljrittle}@motorola.com
Vartika Bhandari
Department of Computer Science, University of Illinois at Urbana-Champaign
[email protected]
Jason B. LeBrun
Department of Computer Engineering, University of California, Davis
[email protected]
Abstract
It is vital to support concurrent applications sharing a wireless sensor network in order to reduce the deployment and administrative costs, thus increasing the usability and efficiency of the network. We describe Melete1 , a system that supports concurrent applications with efficiency, reliability, flexibility, programmability, and scalability. Our work is based on the Mat´e virtual machine [1] with significant modifications and enhancements. Melete enables reliable storage and execution of concurrent applications on a single sensor node. Dynamic grouping is used for flexible, on-the-fly deployment of applications based on contemporary status of the sensor nodes. The grouping procedure itself is programmed with the TinyScript language. A group-keyed code dissemination mechanism is also developed for reliable and efficient code distribution among sensor nodes. Both analytical and simulation results are presented to study the impact of several key parameters and optimization techniques on the code dissemination mechanism. Simulation results indicate satisfactory scalability of our techniques to both application code size and node density. The usefulness and effectiveness of Melete is also validated by empirical study. Categories and Subject Descriptors: C.3 [Special-purpose and Application-based Systems]: Real-time and embedded systems; C.2.4 [Computer-Communication Networks]: Distributed Systems—Distributed applications; D.1.3 [Programming Techniques]: Concurrent Programming— Distributed programming General Terms: Algorithms, Design, Performance, Experimentation Keywords: wireless sensor networks, Muse, Melete, network protocols, Trickle, virtual machine, Mat´e, concurrent applications, dynamic group formation, group-keyed code dissemination, group-keyed code distribution
1 Introduction
State-of-the-art techniques for wireless sensor networks (WSNs) usually support one application throughout the network. While this is reasonable for dedicated networks, it is not sufficient for commercial deployment of WSNs, where a network is usually shared by several departments with individual purposes. In this case, resource sharing is vital to reduce the deployment and administrative costs, thus increasing the usability and efficiency of the network [2–5]. For example, consider a WSN deployed in an enterprise building. The deployment is shared by multiple departments
1 Melete is the Muse of Meditation in Greek mythology.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SenSys’06, November 1–3, 2006, Boulder, Colorado, USA. Copyright 2006 ACM 1-59593-343-3/06/0011 ...$5.00
and owners, each with its own applications. The facility department periodically gathers temperature, luminance, and humidity information with a monitoring application deployed on a subset of uniformly distributed nodes. The building owner deploys a structural health monitoring application to nodes located at critical parts of the infrastructure. The security department may need to track moving personnel by deploying a tracking application that automatically migrates to nodes around the targeted person(s). Moreover, when emergent events are detected, applications which help people evacuate the building may need to be launched. In this example, we expect concurrent execution of applications at both node and network levels. At node level, multiple applications may reside on the same sensor node, and need to be executed concurrently. At network level, the applications are deployed onto different (potentially overlapping) portions of the network. Variations in the environment can trigger dynamic re-deployment of the applications, which in turn may require code migration among nodes. Also, network administrators can upgrade application software, which requires code dissemination to corresponding nodes. Thus, we are concerned with three critical challenges in handling concurrent applications on WSNs: (1) reliable code storage and execution at node level, (2) flexible and dynamic application deployment at network level, and (3) consistent and reliable code distribution across the network. These challenges need to be dealt with under a set of features unique to WSNs, including large network scale, stringent constraints on node resources, and need for easy programmability. A naive approach is to compose all applications into one executable image for all sensor nodes. This approach, however, impairs application reliability with shared storage and executable space. The stringent memory constraint on nodes also limits the number of programmable applications. Moreover, it requires global modification of the executable image when an arbitrary application is modified. Concurrent applications are implicitly supported by treating the network as a distributed database [6, 7]. However, the expressiveness of the database approach is limited in various ways. For example, adding a new aggregation operation is a costly task that may modify the query processor on all sensor nodes. Also, applications such as motion tracking cannot be efficiently supported by the database approach. A few research efforts have explicitly addressed concurrent applications in WSNs [3, 8–11]. In Agilla [8], applications are written as mobile agents that can migrate among sensor nodes. Multiple agents may reside on a single node. While Agilla is suitable for applications such as motion tracking, its reliability and efficiency while distributing an application
on a set of nodes with multi-hop connections are not clear. Another agent-based system, SensorWare [9], targets heavierweight platforms than the typical sensor node architecture. The TinyCubus framework also aims to support concurrent applications per network [3]. Sensor nodes are classified into different roles based on their properties [11]. Thus, concurrent applications are supported by partitioning the network into different groups, with each group executing one application. However, node-level support for concurrent applications is not addressed in TinyCubus. The goal of our work is to overcome the shortcomings of the aforementioned techniques and to support concurrent applications with efficiency, reliability, flexibility, programmability, and scalability. Observing that resource sharing in traditional computer networks is realized via processor/network multiplexing using an operating system, we decided to base our approach on the virtual machine (VM) concept of Mat´e [1], coupled with Trickle [12]. The original goal of Mat´e is to provide ease-of-programming using TinyScript, a relatively high-level programming language, suitable for event-driven applications; the goal of Trickle is to facilitate efficient code dissemination across a network. The Melete system is within the framework of the Muse research project [13] at Motorola Labs. This project primarily explores middleware approaches to reducing the total cost of owning and operating a multiple-use WSN, with the focus on various issues pertaining to efficient and flexible management of diverse functionality in WSNs. Aligned to the goal of Muse, Melete is developed to support concurrent applications with the following design: 1. Modification to Mat´e to support concurrent applications at node level. To achieve reliable execution of applications, each application is associated with dedicated storage and execution space. We adopt the TinyScript language with a few extra instructions, inheriting the programmability of Mat´e. 2. A dynamic grouping technique to support flexible deployment of concurrent applications at network level. Each application is deployed and executed on a subset of sensor nodes that form a group for the application. We refer to this as a group-keyed programming model. Groups may overlap with each other. Also, groups are dynamically formed and adjusted, based on the application requirements and the node status. 3. Enhancement to Trickle to support code dissemination for spatially uneven and temporally changing application deployment. The unbiased and proactive code distribution of Trickle is either infeasible due to stringent memory constraint, or undesirable because it over-kills the code dissemination requirement. We propose a groupkeyed method to reduce the cost of code dissemination by selectively and reactively transporting code to required nodes only. This method is required to support both node-level and network-level application concurrency. The contributions of our work are multi-fold. We design and implement the Melete system for supporting concurrent applications in WSNs. Although based on Mat´e and Trickle, significant modifications and enhancements are developed to address several unique challenges raised by the group-keyed programming model and code dissemination. We also analyze and optimize several useful techniques specifically designed for code forwarding, including lazy forwarding, progressive
flooding, and randomized code caching. We thus obtain preferred settings for several key system parameters. Extensive simulation results are presented to corroborate our analysis. Moreover, empirical results are obtained to demonstrate the usefulness and effectiveness of Melete. The rest of the paper is organized as follows. We discuss related work in Section 2. This is followed by a brief description of Mat´e and Trickle in Section 3. In Section 4, we describe the design of Melete in detail. The analysis and optimization of code forwarding are presented in Section 5. We provide implementation details of Melete in Section 6. In Section 7, we present our empirical study. Finally, we discuss future improvements to Melete in Section 8.
2 Related Work 2.1 Programming Models Agent-based programming models have been used in Agilla [8], SensorWare [9], and [10]. The execution of an agent is triggered by either a match in the tuple space for Agilla, or the occurrence of a registered event for SensorWare. This is similar to the context-based event-driven model in Mat´e, and hence Melete. Mobile agent-based models use push-based code dissemination, which is suitable for monitoring locally changing phenomena. However, it is unclear how to deploy and update code on a set of sparsely distributed sensor nodes using mobile agents. In Melete, we use a pull- and pushbased method to ensure code dissemination to all required nodes. Also, efficient expression of user directions in application deployment is often required in commercial deployments of WSNs. In Melete, this is achieved by issuing grouping instructions from the gateway, without modifying the application code. However, it is not clear how to manipulate tuple spaces in Agilla, or to instruct the Script Manager in SensorWare, to fulfill this requirement. The role assignment algorithm in TinyCubus [3,11] is similar to the grouping technique in Melete. However, since the grouping routines in Melete are treated as a normal Mat´e application, its execution could be triggered by various events, providing extra flexibility and ease-of-programming. Moreover, Melete differs from TinyCubus by supporting concurrent applications at node level. Similar techniques to the group-keyed model have been studied by several papers in literature [14–16]. For example, in data-centric programming [14], a set of nodes sharing the same state are abstracted as a collaboration group. In abstract region-based model [15], a region is defined for a group of sensor nodes based on their topology or properties. Also, a similar concept, neighborhood programming abstraction, is described in Hood [16]. In Melete, we emphasize the manipulability of group formation by allowing a user to dynamically issue grouping instructions from the gateway (some discussion on aspects of group formation may also be found in our prior work [5]). Besides, works such as Hood and Abstract Regions do not directly address the need for concurrent applications, which is the focus of Melete. Nevertheless, the data sharing and communication primitives proposed in these papers can be integrated into Melete to enhance the programmability of each individual application.
2.2 Information Dissemination and Searching Information dissemination protocols for WSNs have been studied in Trickle [12], PSFQ [17], Deluge [18], and Impala [19]. However, these protocols all aim to disseminate the same piece of information to the entire network, whereas
Melete focuses on a group-keyed code dissemination. Tailored to this purpose, major enhancements of Melete over Trickle include passive code dissemination and code forwarding. The role-based code dissemination algorithm in TinyCubus [3] is probably most comparable to our work. This push-based algorithm performs consecutive flooding by each group member with a pre-specified range that reflects the expected minimum distance between group members. This technique is designed for a relatively stable group formation: With dynamic grouping, it may lead to unnecessary flooding or unsuccessful code delivery. Also, how to update the code of a newly joined member is not clear from the paper. The group-keyed code dissemination is also different from multicast [20, 21]. This is because nodes can join a group, and thus need to update their code, on the fly. Using prior multicast techniques, the only way to do so is to periodically multicast to the entire network. In Melete, the newly joined node can request code from nearby group members. To identify a group member, we need certain searching techniques, which have been studied for various applications in wireless ad hoc networks. Many existing search strategies can be captured by the n-ring model [22, 23]. In the n-ring model, a search consists of n consecutive attempts with an ever expanding search radius. Since all attempts are initiated from the requesting node, each attempt has to re-search the area that has already been covered by previous attempts. To overcome this inefficiency, we propose a progressive flooding strategy, which autonomously increases the search radius without intervention of the requesting node. This is realized by leveraging the periodic broadcast of Trickle, which is different from the one-shot broadcast in the n-ring model. Progressive searching method is also studied in ACQUIRE [24] and rumor routing [25]. However, the unicast in ACQUIRE and rumor routing needs unique node identification, which is not required by Melete. Moreover, all these works (n-ring, ACQUIRE, rumor routing) are based on one-shot broadcast, which may be insufficient for error-prone wireless communication. Melete inherits periodic broadcast from Trickle to ameliorate this problem. All the above search strategies employ flooding to a certain extent for information dissemination. Various flooding strategies have been studied to overcome the broadcast storm problem [26]. The progressive flooding in Melete is based on Trickle, which is a repetitive version of the counter-based scheme [26]. Nevertheless, other flooding strategies can also be used to implement the progressive flooding. The on-line caching policy studied in this paper is motivated by techniques proposed for peer-to-peer networks [27, 28] and WSNs [29]. However, the underlying information request model and associated cost function are quite different.
3 Background
3.1 Maté
Maté is a TinyOS-based virtual machine (VM) suitable for event-driven programming [1]. An application is executed when an execution context is invoked by an event. Supported contexts include Reboot, Once, Broadcast, Timer, and Trigger. An application is composed as a set of TinyScript code, one per context. Since code for all contexts shares the same execution space, it is regarded as belonging to the same application. The code is executed using a stack-based architecture. The execution space includes a stack and local variables for each context, and shared variables for all contexts. Also, various hardware device-related tasks (e.g., sensing, broadcasting) are buffered in corresponding queues for the device.
3.2 Trickle
Trickle is a controlled flooding technique for information dissemination in a WSN. A key component of Trickle is a periodic, counter-based broadcasting scheme. Ideally, in each period, only m broadcasts are performed among all sensor nodes within a one-hop connection, where m is a pre-specified parameter. The information disseminated by Trickle is the application code for Maté. Each piece of code for a context is called a code capsule. A capsule may consist of several code chunks, with each chunk fitting into one TinyOS packet. To facilitate code update, Trickle defines three states: MAINTAIN, REQUEST, and RESPONSE. When in the MAINTAIN state, a sensor node periodically advertises version information for its code capsules. If a node detects newer code in its neighbors, it switches to the REQUEST state, and periodically advertises request packets (or simply requests), which are essentially bitmaps of the required code chunks. If a node receives a request, it switches to the RESPONSE state, and responds by broadcasting the required chunks (or simply responses). The broadcast of both requests and responses follows the controlled flooding technique. The periodic advertisement in the MAINTAIN and REQUEST states is driven by a version timer. The version timer is set to the highest firing rate when newer code is discovered. This rate decreases exponentially over time, till it reaches a pre-specified lowest rate. The response in the RESPONSE state is driven by a capsule timer with a constant firing rate.
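As a concrete illustration of the periodic, counter-based scheme described above, the following Python sketch mimics Trickle's suppression counter and exponentially growing advertisement period. The class and parameter names (TrickleTimer, tau_high, tau_low) are our own illustrative choices, not identifiers from the Maté/Trickle code base.

import random

class TrickleTimer:
    """Minimal sketch of Trickle's periodic, counter-based advertisement."""

    def __init__(self, m=1, tau_high=1.0, tau_low=60.0):
        self.m = m                  # max broadcasts expected per one-hop cell per period
        self.tau_high = tau_high    # shortest period (highest firing rate), seconds
        self.tau_low = tau_low      # longest period (lowest firing rate), seconds
        self.reset()

    def reset(self):
        """Called when newer code/version information is discovered."""
        self.tau = self.tau_high
        self.start_period()

    def start_period(self):
        self.counter = 0
        # Fire at a random point in the period to desynchronize neighboring nodes.
        self.fire_time = random.uniform(self.tau / 2, self.tau)

    def heard_consistent_advertisement(self):
        """Another node advertised the same version; count it for suppression."""
        self.counter += 1

    def period_end(self, broadcast):
        """At the end of each period: broadcast only if fewer than m consistent
        advertisements were overheard, then double the period (up to tau_low)."""
        if self.counter < self.m:
            broadcast()
        self.tau = min(self.tau * 2, self.tau_low)
        self.start_period()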
4 System Design 4.1 Assumptions We consider a sensor network connected to a gateway where grouping instructions and application code are injected. We make very few assumptions about the underlying network: 1. The network is connected. 2. The gateway stores code for all applications. 3. All sensor nodes support omni-directional broadcast. We make no assumptions about unique node identification, networking protocol, communication reliability, localization, or time synchronization. However, we may make additional assumptions in Section 5 for analysis purpose.
4.2 Node-Level Support
In brief, to support concurrent applications on a sensor node, it is required to store code for applications, create and maintain a dedicated execution space for each application, and separately compile the code for each application to avoid variable sharing across applications. In the following, we discuss our design based on the TelosB platform [30].
4.2.1 Storage of Code Images
Resource constraints for hosting multiple applications on a sensor node include memory, CPU, and communication capabilities. Since most applications are expected to be low duty cycle, we only consider memory constraints in this paper. There are three data storage devices on a TelosB node: a 10 KB RAM, a 48 KB ROM (internal flash), and a 1 MB external flash. This provides us with several design options. In Deluge, all application code images are stored in the external flash, and loaded into the ROM when needed. After being loaded, an application is usually executed exclusively
for a long period of time. However, in our case, the execution of multiple applications is interleaved, with a potentially high switching rate. This difference in execution model makes the Deluge design unsuitable, due to the high time and energy overheads to access the external flash. This also leaves us the options of storing application code on either the RAM or ROM. In our case, the VM code is large enough to occupy the entire ROM on the TelosB platform. Thus, we choose to store application code on RAM. Note that the actual choice between ROM and RAM is not crucial. The key point is that given the stringent memory constraint on a node, it is often infeasible to store code for all applications. In our implementation, the first application with 5 contexts (Reboot, Once, Broadcast, and two Timer contexts) requires approximately 3.1 KB memory, and each additional application with the same 5 contexts requires approximately 1.46 KB. The 10 KB RAM on a TelosB node supports up to 5 concurrent applications. In this case, the unbiased code dissemination of Trickle is unsuitable, since it may waste memory space with unnecessary application code. Even if the memory space is sufficient, it is still undesirable to blindly store code for all applications on every node. This naive policy wastes precious energy by proactively distributing code throughout the network, regardless of the actual code requirements. These facts motivate our selective and reactive code dissemination method in Section 4.4. Moreover, it is worth mentioning that a memory hierarchy can be created using the RAM/ROM and the external Flash, which mimics the traditional virtual memory architecture. This involves more complex memory replacement and coherence policies, which is one of our future directions.
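As a quick check of the memory figures quoted above, the following Python arithmetic reproduces the bound of five concurrent applications in 10 KB of RAM; it deliberately ignores all other RAM consumers (execution spaces, the VM itself), so it is only a back-of-the-envelope sketch.

RAM_KB = 10.0          # TelosB RAM
FIRST_APP_KB = 3.1     # first application with 5 contexts (figure from the text)
EXTRA_APP_KB = 1.46    # each additional application with the same 5 contexts

def max_concurrent_apps(ram_kb=RAM_KB):
    # One "first" application, then as many additional ones as the remainder allows.
    if ram_kb < FIRST_APP_KB:
        return 0
    return 1 + int((ram_kb - FIRST_APP_KB) // EXTRA_APP_KB)

print(max_concurrent_apps())   # -> 5, matching the bound stated above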
4.2.2 Protection of Application Execution Space Similar to the code images, we also store the execution space for each application in RAM. This includes one copy of shared variables and lock information for the entire application, and separated copies of context information, execution stack, and private variables for each context in the application. To protect execution space, we prohibit variable sharing among different applications. The advantage of this is that the rebooting of one application (e.g., after code updating) does not affect the state and execution of other applications. The disadvantage is that it becomes inconvenient for coordination and information sharing between applications. When an application is rebooted, its execution space is reset to the initial state. Also, the corresponding elements in the task queues of the CPU, networking, and sensing interfaces are removed.
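The following Python sketch illustrates the kind of per-application execution space described above, with nothing shared across applications and a reboot that resets only one application. The class and field names are illustrative assumptions; the actual Melete implementation manages these structures in nesC inside the VM.

class ContextSpace:
    """Per-context execution state (field names are illustrative)."""
    def __init__(self):
        self.stack = []            # operand stack for the stack-based VM
        self.private_vars = {}     # variables private to this context
        self.pc = 0                # program counter within the context's capsule

class AppSpace:
    """Per-application execution space: nothing here is shared across applications."""
    def __init__(self, context_names):
        self.shared_vars = {}      # shared only among this application's contexts
        self.locks = set()
        self.contexts = {name: ContextSpace() for name in context_names}

    def reboot(self):
        """Reset this application to its initial state without touching other apps."""
        self.__init__(list(self.contexts))

# One isolated space per concurrent application hosted on the node.
node_apps = {app_id: AppSpace(["Reboot", "Once", "Broadcast", "Timer1", "Timer2"])
             for app_id in range(3)}
node_apps[1].reboot()   # e.g., after a code update of application 1 only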
4.2.3 Modification to the Compiler Toolchain
The current compiler toolchain of Maté does not distinguish between different applications. We enhance it by maintaining a separate copy of the latest code for each application to (1) avoid variable sharing among applications, and (2) maintain an independent version number for each application. Also, by inheriting the TinyScript language from Maté, Melete achieves ease of programming for application domain experts. The small code size of TinyScript-based application code also enables more efficient code dissemination, compared to nesC-based application code.
4.3 Dynamic Grouping
To support concurrent applications within a network, we develop the concept of dynamic grouping of sensor nodes. More specifically, a set of sensor nodes that need to host a particular application are organized into one logical group. A sensor node can belong to multiple groups, referred to as associated groups. Each associated group corresponds to exactly one application (thus the terms group and application are interchangeable hereafter).2 The members of multiple groups may overlap with each other. Also, a sensor node can dynamically join or leave groups per application and user requirements, referred to as dynamic grouping. Deploying an application with dynamic grouping consists of two consecutive stages: grouping sensor nodes for the application and disseminating code onto the group members. These two stages can be performed either in one shot or incrementally over time. The first stage itself is regarded as a routine of a "grouping" application. Thus, we designate group 0 for this grouping application, and enforce all nodes to be a member of group 0 during their entire lifetime. The code for group 0 is always stored on every node, and can be executed under various contexts as a general Maté application. For example, we can form a group by executing code in the Once context, and periodically re-organize members of the group through the Timer context. The criteria for a sensor node to join a group are based on the sensed data, the properties of the node, and information the node gathers from its neighbors. Since the grouping application is programmed using TinyScript, we gain both programmability and flexibility. We illustrate this using two real-life scenarios.
S1: Consider an application A that monitors the temperature in every room of a building. It is natural to form a group with one sensor node per room (we assume that each node knows the room in which it is located). We can write a Once context for group 0 such that each node gathers group information from nodes in the same room and joins group A if no other node in the same room belongs to A (a sketch of this election logic is given at the end of this subsection).
S2: Consider an application B that runs a sophisticated algorithm to monitor cracks on a wall. We assume that a crack causes abnormal vibration of the wall around it. All sensor nodes can periodically check the wall vibration in the Timer context of group 0, and join application B if the vibration indicates the expansion of cracks to their location.
In both scenarios, the code for applications A and B is transported on-demand to the new members. More features can be added in these scenarios. For instance, we may use a random delay in S1 to avoid two nodes in the same room simultaneously joining group A, or use the current battery level of nodes as a grouping criterion. Also, nodes in S1 need to periodically check the status of group A, and elect themselves into the group if necessary. Our dynamic grouping model has the advantages of (1) supporting concurrent applications sharing the underlying network resources, (2) providing flexible and on-the-fly deployment of applications based on the contemporary status of the sensor nodes and the monitored object, and (3) improving resource utilization via on-demand program downloading. In our implementation, the number of groups coexisting in the network is bounded to 16, and the number of simultaneously associated groups for a node is constrained by its RAM capacity (e.g., up to 5 applications on a TelosB node). Considering the weak processing and communication capabilities of sensor nodes, we believe this to be reasonable for most real-life scenarios. Otherwise, load balancing among sensor nodes in proximity or deploying extra nodes are useful techniques to alleviate the problem. More details of the above dynamic grouping can be found in our prior work [5]. Also, it is possible to incorporate the role-assignment model [11] into our dynamic grouping. Thus, nodes within the same group may have different roles.
2 In general, an application may require multiple groups to be realized. Without modification, our work supports such requirements.
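The following Python pseudocode mirrors the S1 election logic sketched above; in Melete it would be written as the TinyScript Once context of group 0. The helper names (my_room, neighbor_group_info, join_group) and the group id are assumptions made purely for illustration.

import random, time

GROUP_A = 1   # hypothetical group id for the temperature application A

def once_context(node):
    """Sketch of the group-0 Once routine for scenario S1 (illustrative only)."""
    time.sleep(random.uniform(0.0, 2.0))          # random delay to break ties between nodes
    room = node.my_room()                          # assumes the room id is known locally
    neighbors = node.neighbor_group_info()         # list of (room, associated_groups) pairs
    already_covered = any(room == r and GROUP_A in groups for r, groups in neighbors)
    if not already_covered:
        node.join_group(GROUP_A)                   # triggers a group-keyed code request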
4.4 Group-Keyed Code Dissemination
Code dissemination is required when group members are dynamically adjusted, or when application code needs to be upgraded to a newer version. The goal of our group-keyed method is to (1) selectively distribute application code to only associated sensor nodes, and (2) reactively distribute code only when it is required. The first goal overcomes the stringent memory constraint of sensor nodes: only code for associated groups is stored on a sensor node. This is important when there are a large number of applications running in the network, with every sensor node hosting a few. The first goal also reduces communication costs by avoiding transporting code to nodes unnecessarily. This is particularly useful when the network scale is large, but group members constitute a small fraction of the total population. The second goal delays code transportation until the exact moment the code is required. This also reduces the unnecessary communication costs resulting from proactive protocols, e.g., Trickle. However, the reactive method may incur high delay in code transportation. Thus, for mission-critical applications, proactively transporting code to hosting sensor nodes may still be necessary. These two goals distinguish Melete from the unbiased and proactive Trickle, in terms of both functionality and performance. To achieve these goals, we need to address a set of challenges, including (1) on-demand code dissemination, (2) code transportation from nodes multiple hops away, and (3) network traffic minimization. Accordingly, we propose solutions including passive code dissemination, code forwarding, and progressive flooding.
4.4.1 Passive Code Dissemination
We propose a passive code dissemination policy with active advertisements. Specifically, version information of all groups is disseminated throughout the network and maintained by all sensor nodes, while code is passively disseminated only when it is requested by certain nodes. Since version packets are usually smaller than code packets, this policy aims to minimize network traffic overhead while keeping all sensor nodes up to date without large delay. Specifically, each node maintains the version information of all applications that it has heard of. Each node advertises its version information for all groups in a round-robin fashion. Whenever a node receives newer version information about a group, it updates its local data, and sets its version timer to the highest rate. Similar to Trickle, this rate decreases exponentially to a pre-specified lowest rate. If the received information is for an associated group, the node switches to the REQUEST state, and advertises its request for the new code. It is possible to combine version information of multiple groups into one packet to achieve a higher advertising rate per group. This is considered as a future direction of our work. The key difference between Melete and Maté's Trickle is that Trickle allows nodes to advertise version information only after receiving the code, while Melete allows the propagation of version information without sending the actual code.
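A minimal Python sketch of this version bookkeeping is given below; the class, method names, and the timer object are illustrative assumptions rather than the actual Melete data structures.

class VersionTable:
    """Per-node version bookkeeping for passive code dissemination (sketch)."""

    def __init__(self, associated_groups, version_timer):
        self.versions = {}                    # group id -> latest version heard of
        self.associated = set(associated_groups)
        self.timer = version_timer            # e.g., a TrickleTimer as sketched earlier
        self.rr_order = []                    # round-robin advertisement order

    def on_advertisement(self, group, version, switch_to_request):
        """Handle a version advertisement overheard from a neighbor."""
        if version > self.versions.get(group, -1):
            self.versions[group] = version    # record the newer version even without the code
            if group not in self.rr_order:
                self.rr_order.append(group)
            self.timer.reset()                # advertise aggressively again
            if group in self.associated:
                switch_to_request(group)      # we actually need the new code

    def next_advertisement(self):
        """Advertise versions for all known groups in a round-robin fashion."""
        if not self.rr_order:
            return None
        group = self.rr_order.pop(0)
        self.rr_order.append(group)
        return (group, self.versions[group])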
4.4.2 Code Forwarding
Trickle supports code transportation between one-hop neighbors, which is insufficient for sparsely distributed group members. We construct a multi-hop region between the requesting node (or requester) and potential responding nodes (or responders) so that both requests and responses can be transmitted across the region, referred to as a forwarding region. To do so, we add one more state, FORWARD, to Trickle. A sensor node switches to the FORWARD state if it "believes" that its neighbors need help to get their code, and stays in the FORWARD state until one of the following three events occurs: timeout, the requests are fulfilled, or there is a need to switch to the REQUEST or RESPONSE states. This implies that the REQUEST and RESPONSE states have higher priority than the FORWARD state. Nevertheless, such a bias can be altered if so desired. A set of sensor nodes in the FORWARD state (or forwarders) constitutes a forwarding region. Each forwarder is capable of caching one request and one code chunk in its forwarding cache. A forwarder broadcasts the cached request using the Trickle protocol, as if it were a normal requester. When the forwarding region expands to a responder, the responses are transmitted throughout the forwarding region. Each forwarder caches the most recently received chunk and forwards it using Trickle, as if it were a normal responder. Nodes outside the forwarding region discard any received responses. As in Trickle, each forwarder maintains a bitmap, indicating the remaining chunks that need to be forwarded. To reduce traffic, the bitmap of a node is also updated upon reception of forwarded responses in the neighborhood. When the bitmap indicates no more chunks to forward, the forwarder switches to the MAINTAIN state. However, two problems remain unresolved in the above mechanism. First, neighbors of a requester may blindly start forwarding even though some other neighbors of the requester can resolve the request. Second, the forwarding region can unnecessarily expand to the entire network even though a nearby responder is discovered. We discuss two techniques to solve these problems: lazy forwarding and progressive flooding. We cover lazy forwarding in this subsection and explain progressive flooding in the next subsection. From a prospective forwarder's perspective, it should offer help to the requester only when the help is truly needed (i.e., when one-hop neighbors of the requester did not resolve the request). This is difficult because either the requests or the responses can be lost or corrupted over the wireless channel. To handle this problem, we propose an approach called lazy forwarding. The main idea is to allow enough time for neighbors of a requester to respond before starting a forwarding process. Specifically, each time a node receives a request, it switches to the FORWARD state with a certain probability, Pf. Pf increases with the number of received requests. It is similar to an everyday scenario: after hearing multiple shouts for help, a person is more assured that someone is in real trouble. Thus, we have
Pf = f(q) ,  (1)
where q is the number of requests received so far, and f is an increasing function with a range within [0, 1]. We discuss our choice of f(q) in Section 5. We depict the state transitions between the FORWARD and other states in Figure 1. In the figure, the requests and responses relayed in the FORWARD state refer to those of the group that triggers the FORWARD state. Also, Pf is calculated every time a request is received.
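A minimal sketch of the lazy-forwarding decision, assuming the Pf choice discussed in Section 5 (c = 1/3, a = 1/2):

import random

def forward_probability(q, c=1/3, a=0.5):
    """P_f = min(1, c * q**a); the paper's choice is c = 1/3, a = 1/2."""
    return min(1.0, c * q ** a)

def maybe_become_forwarder(requests_heard):
    """Lazy forwarding: re-evaluate P_f each time a request for an
    unassociated group is overheard (sketch)."""
    return random.random() < forward_probability(requests_heard)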
Figure 1. State transition in Melete (states and transitions in dotted lines are from the original Trickle)
To save memory space, we enforce that any node can be in the FORWARD state for exactly one code capsule (which can correspond to multiple, simultaneous requesters). Thus, we only need to maintain one copy of the state information in the memory. However, this strategy prohibits overlapping of forwarding regions for different capsules. We examine its consequence through simulations in Section 5.4.3.
4.4.3 Progressive Flooding
To prevent a forwarding region from expanding to the entire network, we propose a progressive flooding technique, a special n-ring model tailored to the periodic broadcasting of Trickle. For each forwarded request, we associate a time-to-live (TTL), in terms of hop counts. Let H denote the largest value of TTL. In the n-ring model, a request with TTL = 0 is discarded by a receiver. However, in progressive flooding, a receiver keeps counting received requests with TTL = 0, and switches to the FORWARD state using lazy forwarding. After becoming a forwarder, the receiver advertises the requests with TTL = H − 1, i.e., requests with TTL = 0 are reincarnated. Nodes receiving requests with TTL > 0 switch to the FORWARD state immediately, and forward the requests with a decremented TTL. Using this method, the requester broadcasts its requests with TTL = 0. Its neighbors switch to the FORWARD state using lazy forwarding, and forward the requests with TTL = H − 1 afterwards. An initial forwarding region of H-hop width is formed when the TTL of the forwarded requests is decremented to 0. If no responder is discovered in the initial forwarding region, based on lazy forwarding, the forwarding region expands to nodes H + 1 hops away from the region. In this manner, the forwarding region expands ripple-style, until a responder is discovered. If at least one responder is discovered by a forwarder, the forwarder sets TTL = −1 for all forwarded requests, and returns to the highest advertising rate. So do all forwarders receiving requests with TTL = −1. Thus, requests with TTL = −1 propagate quickly (faster than the ripple-style expansion) throughout the forwarding region. All non-forwarding nodes receiving a request with TTL = −1 simply drop the packet. Thus, the forwarding region stops expanding. Meanwhile, the responses are forwarded to the requester. While similar in spirit to the n-ring model [31], progressive flooding inherits and leverages the periodic broadcasting of Trickle. Our key idea is to autonomously expand the forwarding region using lazy forwarding, whereas in the n-ring model, the requester initiates each round of searching. Thus, lazy forwarding coupled with progressive flooding shifts the task of expanding a forwarding region from the requester to the forwarders. We quantitatively analyze the performance improvement of progressive flooding over the n-ring model in Section 5. Since we assume no unique node identification, the forwarding of responses is based on Trickle, instead of a one-to-one communication from the responder to the requester. Although our Trickle-based method inevitably causes larger network traffic, it provides reliable code forwarding (i.e., similar to multi-path routing).
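The TTL rules above can be summarized by the following Python sketch; the node interface (is_forwarder, become_forwarder, forward_request) is an assumption for illustration, and the lazy-forwarding test is the probabilistic decision from Section 4.4.2.

def handle_request_ttl(node, ttl, H, maybe_become_forwarder, requests_heard):
    """Sketch of the progressive-flooding TTL rules (illustrative node interface)."""
    if ttl == -1:
        # A responder has been found: forwarders propagate the stop signal quickly,
        # non-forwarders simply drop the packet so the region stops expanding.
        if node.is_forwarder():
            node.forward_request(ttl=-1, at_highest_rate=True)
        return
    if ttl > 0:
        node.become_forwarder()
        node.forward_request(ttl=ttl - 1)           # immediate expansion inside the ring
    else:  # ttl == 0: edge of the current ring
        if maybe_become_forwarder(requests_heard):   # lazy forwarding (Section 4.4.2)
            node.become_forwarder()
            node.forward_request(ttl=H - 1)          # "reincarnate" the request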
5 Analysis and Optimization
Since the performance of Melete is largely determined by the code forwarding technique, we analyze code forwarding with respect to a set of key parameters, including the Pf function and the TTL of forwarded requests. We also propose a code caching technique to further optimize code forwarding. Table 1 summarizes the used notations.
Table 1. List of notations
q: # of received requests
H: TTL of forwarded requests, in hop counts
r: Radio range of sensor nodes
U: Radius of the sensor field in units of r
p: # of chunks in the requested capsule
L: Radius of the sensor field in units of rH, L = ⌈U/H⌉
z: # of responders
Q: Expected waiting time using lazy forwarding
m, m1, m2: Trickle parameters
Ri: The i-th forwarding ring from the center
Ci: # of forwarded requests in time [iQ, (i + 1)Q]
Gi: A group, i = 1, . . . , g
ρi: # of members of Gi
δi: # of requests for Gi
Fi: Expected traffic to fulfill a request for Gi
u: Upper bound of caching nodes in the field
ηi: # of caches for Gi
5.1 Traffic Pattern under Trickle
We first model the network traffic pattern under Trickle. In Trickle, all nodes in a circle of radius r are expected to broadcast at most m times in a specific time interval, where m is a pre-specified parameter.3 This property leads to a linear model of the network traffic based on the area covered by the nodes. Specifically, for an advertising rate of one packet per k time units, nodes in an area of size A broadcast mA/(πr²) packets every k time units. Consider the case where nodes in an area of size A start broadcasting from the highest communication rate that decays exponentially over time. Without loss of generality, we assume the highest rate to be 1 per time unit. Also, we assume a strictly synchronized node behavior, i.e., nodes in a circle of radius r compete for m broadcasts in time durations [1, 1], [2, 3], [4, 7], . . .. Thus, for a time duration Q, m log Q · A/(πr²) packets are broadcast in total.
3 Although m is a logarithmic function of node density in practical scenarios, this does not affect the results of our analysis. We examine the effect of this property of m in our simulations.
5.2 Choosing Pf
Based on (1), Pf increases with q, the number of received requests with respect to an unassociated group. We choose a simple polynomial function for f: f(q) = cq^a. Intuitively, we expect the first few received requests to have a larger impact on Pf than the requests received later. In other words, we desire lim_{q→0} Δf(q)/Δq → ∞ and lim_{q→∞} Δf(q)/Δq → 0. We observe that both conditions are satisfied when choosing a between (0, 1). Another way to understand it is that the useful information from a newly received request diminishes with q, hence the concaveness of f(q). For our purpose, we simply set a = 1/2. In addition, c is a tunable parameter to adjust how fast f(q) saturates to 1. Throughout this paper, we set c = 1/3, meaning that a sensor node switches to the FORWARD state after receiving at most 9 requests. Thus, we have the following function Pf:
Pf = min(1, 0.33√q) .  (2)
The expected number of requests to be received for a node to switch to the FORWARD state can be calculated as q̄ = ∑_{q=1}^{9} ∏_{i=1}^{q−1} (1 − 0.33√i) ≈ 2.25. Let Q denote the expected waiting time. Assuming the requests are broadcast from the highest rate that decays exponentially over time, we have Q = 2^q̄ ≈ 5 time units.
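The constants quoted above can be reproduced with a few lines of Python; the only assumption beyond the text is that the expected waiting time is computed as 2 raised to the power q̄ under the doubling broadcast periods.

import math

def p_f(q, c=1/3, a=0.5):
    return min(1.0, c * q ** a)

# Expected number of requests heard before switching to FORWARD:
# q_bar = sum_{q=1}^{9} prod_{i=1}^{q-1} (1 - P_f(i))
q_bar = sum(math.prod(1 - p_f(i) for i in range(1, q)) for q in range(1, 10))
Q = 2 ** q_bar   # expected waiting time under exponentially doubling periods (assumption)
print(round(q_bar, 2), round(Q, 1))   # ~2.25 requests, ~4.8 (about 5) time units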
5.3 Impact of H
H is a key parameter to determine the behavior of code forwarding. Specifically, H affects both the time delay to obtain the code and the incurred network traffic in terms of forwarded requests and responses. Using a simplified model, we analyze the impact of H on network traffic and time delay. We also verify our analysis using simulation results.
5.3.1 Impact on Network Traffic
For tractable analysis, we study a circular sensor field with uniformly distributed sensor nodes. Each node can communicate with any other node within radio range r. Let U denote the radius of the field in units of r. We assume a sufficiently dense network so that a packet can reach all sensor nodes within distance kr in k hops. We assume that the requester is located at the center and z responders are uniformly distributed in the field. We also assume that the requested code capsule consists of p chunks. Based on lazy forwarding, the forwarding region can evolve to an arbitrary shape. However, we assume that the expansion of the forwarding region forms a series of concentric rings around the requester. This assumption reflects an expected, synchronized behavior of the flooding, as if we set Pf(q) = 0 for q < q̄ and Pf(q) = 1 otherwise. In other words, after a new ring is formed, the forwarding region remains stable for a time duration Q before further expansion. Each ring has a width of rH. Let L denote the radius of the sensor field in units of rH, i.e., L = ⌈U/H⌉. We denote the rings formed by time (i + 1)Q as {R1, R2, . . . , Ri}, i = 1, . . . , L. If any responders are discovered in Ri, the flooding of requests completes and the forwarding of responses starts. Otherwise, a new ring Ri+1 starts forming at time (i + 1)Q + 1, e.g., ring R1 starts forming at time Q + 1. Therefore, the process of code forwarding consists of two stages: the discovery stage for propagating the requests and the forwarding stage for transporting the responses. We first model the traffic in the discovery stage. Let Ci denote the number of requests forwarded in the time duration [iQ, (i + 1)Q].
We know that the advertising rate decreases exponentially from Ri to R1. We assume a sufficiently small lowest rate so that it suffices to approximate Ci by the network traffic generated in a few outermost rings. For our purpose, we consider the two outermost rings: Ri and Ri−1. This approximation is close because even if L is large, the area of the two outer rings is proportionately larger than that of all the inner rings. Since Q is usually much smaller than the period of the lowest advertising rate, we assume that nodes in rings Ri and Ri−1 do not reach the lowest rate during [iQ, (i + 1)Q]. Based on the traffic pattern in Section 5.1, each node in Ri has the chance to compete for log Q broadcasts with its one-hop neighbors in [iQ, (i + 1)Q], while each node in Ri−1 competes exactly once. Taking m into consideration, we have:
Ci = mH²((2i − 1) log Q + (2i − 3)) .  (3)
The periodic broadcasting in the discovery stage implies that the traffic is minimized by reducing either the number of forwarders or the time duration. The first option prefers a smaller H, while the second favors a larger H. The exact impact of H is determined by the distribution of the responders, especially the location of the nearest one. Let a random variable X denote the distance of the nearest responder from the requester. Consider the event that the forwarding region expands to Ri at time iQ. This event occurs if and only if no responder is discovered in rings {R1, R2, . . . , Ri−1}. Thus, we can calculate the expected traffic in the discovery stage as
Cd(H) = ∑_{i=1}^{L} Ci · Pr{X > (i − 1)rH} ,  (4)
where L = ⌈U/H⌉ and Pr{X > (i − 1)rH} = (1 − (i − 1)²/L²)^z. To simplify the notation, let A = m1H² and P(i) = (1 − i²/L²)^z, where m1 is the value of m for requests. For ease of analysis, we assume that U is a multiple of H. We have
Cd(H) = A ∑_{i=1}^{L} ((2i − 1) log Q + (2i − 3)) P(i − 1)
      ≈ m1 U ( U(log Q + 1)/(z + 1) + H√π (log Q − 1) Γ(z + 1)/(2Γ(z + 3/2)) ) .  (5)
We verified through simulation that the above approximations are close for z ≥ 2 (it is reasonable to believe that the case of z = 1 is very rare for real-life applications). Details of the derivation can be found in [32]. We now model the traffic generated in the forwarding stage, in terms of forwarded chunks. Since this stage uses a one-shot broadcast-based Trickle, the traffic in ring Ri is simply m2 pH²(2i − 1), where m2 is the value of m for responses. Thus, we can derive the traffic in the forwarding stage as:
Cf(H) = m2 p U ( U/(z + 1) + H√π Γ(z + 1)/(2Γ(z + 3/2)) ) .  (6)
It is observed that both Cd(H) and Cf(H) are minimized when H = 1, or equivalently L = U. As a comparison, the network traffic of using the n-ring model under Trickle is:
Cn-ring(H) = mH²(p + log Q) ∑_{i=1}^{L} i² P(i − 1) .  (7)
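For reference, the following Python sketch evaluates the discovery-stage traffic as the exact sum behind (4) and the forwarding-stage traffic of (6), using Figure 2's parameter values; taking log Q base 2 (to match the doubling broadcast periods) is our assumption, not stated in the text.

from math import gamma, log2, pi, sqrt, ceil

def P(i, L, z):
    return (1 - i**2 / L**2) ** z

def C_d(H, U=16, z=16, Q=5, m1=1):
    """Discovery-stage traffic: exact sum of eq. (4) with Ci from eq. (3)."""
    L = ceil(U / H)
    logQ = log2(Q)                       # assumption: log taken base 2
    return sum(m1 * H**2 * ((2*i - 1) * logQ + (2*i - 3)) * P(i - 1, L, z)
               for i in range(1, L + 1))

def C_f(H, U=16, z=16, p=2, m2=2):
    """Forwarding-stage traffic: closed form of eq. (6)."""
    return m2 * p * U * (U / (z + 1)
                         + H * sqrt(pi) * gamma(z + 1) / (2 * gamma(z + 1.5)))

for H in (1, 2, 4, 8, 16):
    # Both quantities grow with H and are minimized at H = 1, as stated above.
    print(H, round(C_d(H), 1), round(C_f(H), 1))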
Figure 2. Numerical results on impact of forwarding hops, H: (a) forwarded chunks versus H; (b) completion time versus H (U = 16, m1 = 1, m2 = 2, Q = 8, p = 2, x-axis in logarithmic scale)
In Figure 2(a), we demonstrate numerical results of Cf(H) (Cd(H) is omitted since it differs only in the constant coefficients). We set U = 16, m1 = 1, m2 = 2 (default settings of m1 and m2 in Trickle), Q = 5, p = 2, while varying z in {4, 8, 16, 32} and H in {1, 2, 4, 8, 16}. From the figure, we see that the network traffic is minimized when H = 1. Also, as expected, the network traffic decreases with z.
For large z, we consider a close-to-optimal solution of the n-ring model by setting H = 1 [22]. With z ∈ [20, 40], numerical results indicate a 2- to 3-fold reduction in network traffic by the progressive flooding (H = 1), compared to the n-ring model.
5.3.2 Impact on Time Delay
Let Td(H) denote the expected time delay to discover a responder with H (referred to as the discovery time hereafter). Td(H) is modeled as:
Td(H) = ∑_{i=1}^{L} Q P(i − 1) ≈ (QU/H) · √π Γ(z + 1)/(2Γ(z + 3/2)) .  (8)
Similarly, the time to propagate the p code chunks back to the requester is modeled as Tf(H) = Td(H) · pH/Q. Moreover, the entire time to complete a code forwarding, or the completion time, is Td(H) + Tf(H). It is observed that both Td(H) and Tf(H) are inversely proportional to H. We illustrate numerical results on the completion time in Figure 2(b). We now jointly consider Figures 2(a) and 2(b) for a balanced network traffic and time delay by tuning H. We observe that slightly increasing H can effectively trade energy for latency. For example, when z = 16, increasing H from 1 to 4 results in a 75% reduction in time delay, at the cost of a 30% increase in the network traffic for the discovery stage and a 56% increase in the forwarding stage. This tradeoff is particularly useful for applications with real-time constraints.
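A small Python sketch of the discovery-time model (8), using the same parameter values as Figure 2; it illustrates the 1/H scaling underlying the tradeoff discussed above.

from math import gamma, pi, sqrt

def T_d(H, U=16, z=16, Q=5):
    """Expected discovery time, closed-form approximation of eq. (8)."""
    return (Q * U / H) * sqrt(pi) * gamma(z + 1) / (2 * gamma(z + 1.5))

for H in (1, 2, 4, 8, 16):
    print(H, round(T_d(H), 2))
# T_d scales as 1/H: going from H = 1 to H = 4 cuts the expected discovery
# time by 75%, while Section 5.3.1 shows the traffic grows with H.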
5.3.3 Simulation Results We used TOSSIM [33] to study the behavior of progressive flooding in a realistic environment. Our version of TOSSIM simulated the CC1000 radio on the Mica2 platform. We set m1 = 1 and m2 = 2. Also, the highest advertising rate of requests was once per second, and the lowest rate was once per minute. The broadcast rate of responses was fixed at once per second. We use these settings throughout the rest of the paper, unless otherwise stated. Packet-Level Simulations: We first used a packet-level TOSSIM to simulate large-scale networks. A 33 × 33 grid deployment with a 15-foot spacing was generated using the TOSSIM’s empirical model [34], with the radio transmission range set to 50 feet. The packet-level TOSSIM also models
packet collision when a node receives multiple packets simultaneously (regardless of signal strength). A single requester was located at the center of the network, while z responders were uniformly distributed in the network.
Figure 3. Packet-level simulation results for code forwarding (1089 nodes with one requester and z responders)
In Figure 3, we illustrate the network traffic and time delay averaged over 200 runs. We observed that the trend of the curves and the tradeoffs between network traffic and time delay confirmed our analysis. For example, when z = 16, increasing H from 1 to 4 resulted in a 35% reduction in both discovery time and completion time, at the cost of a 22% increase in forwarded requests, and a 54% increase in forwarded chunks. The differences in the exact numbers for network traffic and time delay indicate that the complex system behavior is not fully captured by our model. For example, our analysis does not model packet error, and assumes a strictly synchronized behavior of nodes. Both under-estimate the network traffic.
Impact of Packet Collisions: We also used a bit-level, CSMA-enabled TOSSIM to further examine the impact of packet collisions. Due to the large running time of TOSSIM, we downscaled the network to a 17 × 17 grid, while keeping the radio model unchanged. The resulting network traffic and time delay exhibited a very similar trend to that in Figure 3. However, network traffic increased faster with H in the bit-level TOSSIM. Also, the curves for time delay were less steep. These could be explained as a result of the higher packet collision modeled by the bit-level TOSSIM. With larger H, a greater forwarding region tended to be formed, resulting in more severe packet collision. Details of the simulation results are omitted due to space limitations, and can be found in [32].
5.4 Randomized Code Caching Results in Section 5.3 suggest that an effective method to reduce the cost of code forwarding is to increase the number of responders. For this purpose, we further utilize the forwarding cache to replicate code chunks in the network. Specifically, besides storing the code for associated groups, each sensor node can cache one extra chunk in its forwarding cache. All locally stored code chunks are used to respond to a matched request. Due to the limited size of the forwarding cache, a key question is how to distribute the code chunks so that the overall cost of code forwarding for all groups is minimized. To
answer this question is challenging since there is no formal model to describe the distribution and dynamics of group formation in the network. As an initial step, we start with a static problem setting to gain insights for an on-line policy.
5.4.1 Static Analysis
We define a static cache distribution problem based on a snapshot of the system. We consider a circular field with n uniformly distributed sensor nodes. Each node is capable of caching one extra code chunk in addition to the code of associated groups. We assume there are g groups {G1, G2, . . . , Gg} in the snapshot. For each group Gi, i = 1, . . . , g, ρi members are uniformly distributed in the field. To model the fact that one code capsule may consist of multiple chunks, we assume that among the n sensor nodes, only u < n of them are capable of code caching, and each capsule consists of exactly one chunk. Thus, n/u is the expected number of chunks for all capsules in the original problem. We assume that in this snapshot, the code of each group Gi, i = 1, . . . , g is requested by δi nodes that are uniformly distributed in the network. For each group Gi, a requester is expected to obtain the code from the nearest responder that either is a member of Gi or caches the code of Gi. Let ηi denote the number of nodes that cache code for Gi. The total number of copies of the code of Gi is then ρi + ηi. The expected network traffic to locate and transport the code of Gi is approximated by the sum of (5) and (6) with z = ρi + ηi. From Section 5.3, such traffic is minimized when H = 1; let Fi(ηi) denote the minimum. The expected network traffic to fulfill all requests for Gi is approximated as Fi(ηi)δi. Our goal is to find a vector η = (η1, . . . , ηg) such that the sum of the expected network traffic over all groups is minimized. Precisely speaking, we are interested in finding an η so as to
minimize   ∑_{i=1}^{g} Fi(ηi)δi  (9)
subject to ∑_{i=1}^{g} ηi ≤ u  (10)
           ηi ≥ 0, i = 1, . . . , g  (11)
Since Fi is a decreasing function of ηi, the optimal solution shall have ∑_{i=1}^{g} ηi = u. Using Lagrangian relaxation, the optimal solution also follows
Fi'(ηi)δi = −λ, for ηi > 0  (12)
Fi'(ηi)δi ≥ −λ, for ηi = 0 ,  (13)
where Fi'(·) is the first derivative of Fi(·) and λ > 0 is a Lagrangian multiplier. While this problem can be numerically solved, we study an approximated solution for more insights. For ease of analysis, we focus on scenarios where ηi > 0 for all i's in the optimal solution. When z is sufficiently large, we approximate Γ(z + 1)/Γ(z + 3/2) using h(z)/(z + 1), where h(z) is a weak function of z. Numerical results show that h(z) increases from 3.5 to 10 as z varies from 10 to 100. Hence, we may simplify Fi as
Fi(ηi) ≈ c/(ρi + 1 + ηi) ,  (14)
where c = U²(m1(log Q + 1) + m2 p) + (h(z)UH√π/2)(m1(log Q − 1) + m2 p).
We then derive an approximated η as follows:
ηi = (u + g + ∑_{i=1}^{g} ρi) · √δi / ∑_{i=1}^{g} √δi − (ρi + 1) .  (15)
Equation (15) indicates that the number of caches for an application should be proportional to the square root of the number of requests, and should decrease with the number of existing responders. This is similar to the SQUARE-ROOT replication strategy in peer-to-peer networks [28] and WSNs [29]. However, our problem is formulated from a different domain, with a different searching method and associated cost.
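A direct Python transcription of the closed-form allocation (15) is given below; the example numbers are hypothetical, and in practice negative allocations would have to be clipped and the remainder re-distributed.

from math import sqrt

def cache_allocation(u, rho, delta):
    """eta_i caches for group i from eq. (15), given u cache slots,
    rho_i existing members, and delta_i expected requests per group."""
    g = len(rho)
    total = u + g + sum(rho)
    norm = sum(sqrt(d) for d in delta)
    return [total * sqrt(delta[i]) / norm - (rho[i] + 1) for i in range(g)]

# Example: 100 cache slots, 4 groups with 20 members each, 20-40 requests per group.
print([round(e, 1) for e in cache_allocation(100, [20] * 4, [20, 25, 30, 40])])
# The allocations sum to u and grow with the square root of the request count.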
5.4.2 On-line Caching Policy
Because it is difficult to gather the information about the δi's and ρi's in an on-line fashion, realizing the optimal policy in a distributed manner is challenging. The path replication described in [28] sets the caching size of a request to be its query size, i.e., the number of nodes queried before discovering a responder. In our case, the number of forwarders can be regarded as the query size. This motivates a simple caching policy: Each forwarder caches the last forwarded code chunk. Intuitively, the number of caches allocated to a group by the optimal policy is sub-linear in its number of requests. That is, the optimal policy is "in between" the uniform and proportional policies [28], where the uniform policy allocates the same number of caches per group, and the proportional policy allocates caches linearly with the number of requests. Consider ∑i δi requests spanning a time period T. Given a group Gi with ρi members (responders) uniformly distributed in the network, the expected distance from a uniformly distributed requester to the closest responder is given by di = 0.5√(S/(ρi + 1)) [35], where S is the area of the sensor field. Thus, the expected number of forwarders to fulfill the request is B = απdi² = 0.25απS/(ρi + 1), where α is the node density. These forwarders will cache the code for Gi immediately after the forwarding process. Therefore, the number of caches for a group increases with its requesting frequency over T. Also, the increase is sub-linear over time as the number of responders (the dominant factor in B) increases during the process. Moreover, since the forwarding cache is overwritten after every forwarding process, we expect the cached code to be uniformly distributed across the network after a large T, despite the fact that each forwarding process causes localized code caching. Overall, we expect the on-line caching to self-adapt to the requesting frequency and the existing number of responders.
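The on-line policy itself reduces to a one-slot cache that is overwritten by every forwarding process, as in the following sketch (class and method names are illustrative):

class Forwarder:
    """Sketch of the on-line caching policy: the single-slot forwarding cache
    keeps the last chunk forwarded, so caches accumulate along the paths of
    recent requests and are overwritten by later forwarding processes."""

    def __init__(self):
        self.cached = None          # (group, chunk_id, payload) or None

    def on_forward_chunk(self, group, chunk_id, payload):
        self.cached = (group, chunk_id, payload)   # overwrite whatever was there

    def can_answer(self, group, chunk_id):
        """Cached chunks are served to matching requests like member code."""
        return self.cached is not None and self.cached[:2] == (group, chunk_id)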
5.4.3 Simulation Results We generated a network of 400 nodes regularly deployed on a 20 × 20 grid. Initially, four groups were deployed in the network, with number of members, ρi = 20 for i = 1, 2, 3, 4. All group members were uniformly distributed in the network. The number of requesters for each group, δi , was randomly chosen from [20, 40]. These requesters were uniformly distributed in the network with the requesting time uniformly distributed within 10 minutes. To investigate the impact of u, we varied the capsule size, p, from one to four chunks per capsule. Also, we set H = 1. Effect of Randomized Code Caching: In Figure 4, we illustrate the averaged network traffic and completion time to fulfill each requester in two scenarios: with and without caching. The discovery time is omitted to accommodate the comparison between the two scenarios. The illustrated data was averaged over 200 runs of the packet-level TOSSIM.
Figure 4. Performance improvement by using randomized code caching (400 nodes, 4 groups with 20 members each, 20 to 40 requests per group in 10 minutes). Panels show the number of forwarded requests, the number of forwarded chunks, the completion time (in seconds), and the ratio of caching vs. no caching (for the number of status messages, number of chunks, and completion time), each as a function of the number of chunks per capsule.

We observed a dramatic increase in both the number of forwarded requests and the completion time when the number of chunks per capsule increased from 1 to 2. By analyzing the simulation traces, we found that this was because each forwarder could serve requests for exactly one code capsule. Thus, when multiple capsules were requested simultaneously, a forwarding region might be blocked in its discovery stage by existing forwarding regions around it. The blocked forwarding region could only progress after the existing forwarding regions had completed, greatly increasing both the completion time and the number of forwarded requests. This "blocking phenomenon" became severe with p > 1 due to the increased completion time. However, the blocking phenomenon did not affect the forwarding stage: the number of forwarded chunks scaled almost linearly with the number of chunks per capsule. We observed that code caching alleviated the blocking phenomenon. By using code caching, we achieved a 20-30% reduction in the number of forwarded requests and a 20-50% reduction in the completion time. Furthermore, we found that only a very small portion of the requests were actually affected by the blocking phenomenon. In Figure 5, we show the probability density function (PDF) of the number of forwarded requests for resolving all code requests in the 200 runs of the caching scenario, with p = 2. The PDF was close to an exponential distribution. This confirmed that most requests were resolved with little traffic: the large average values in Figure 4 are caused by the very long tail. The PDFs of the number of forwarded chunks and the completion time showed similar patterns for p = 1, 2, 3, 4.

Figure 5. PDF of the # of forwarded requests (400 nodes, 4 groups with 20 members each, 20 to 40 requests per group in 10 minutes, with caching).

Comparison to Trickle: We compared Melete to Trickle with respect to code dissemination for a particular group. We designed five scenarios to model various group member distributions. In the first four scenarios, we randomly chose ρ nodes from (1) the 400 nodes in the network, (2) the 100 nodes in the corner closest to the gateway, (3) the 100 nodes clustered in the middle of the network, and (4) the 100 nodes in the corner farthest from the gateway. We varied ρ among 10, 20, and 40. We also examined a baseline scenario, where Melete and Trickle were used to distribute code to all 400 nodes in the network. In this way, we examined both uniformly distributed and clustered group formations. The presented data was averaged over 200 runs of the packet-level TOSSIM.

Figure 6. Comparison to Trickle (400 nodes; 10, 20, and 40 group members; BL stands for Baseline). Panels: (a) # of forwarded requests, (b) # of forwarded chunks, (c) completion time, across the five scenarios.

In Figure 6, we observe that the traffic and time costs of Melete were significantly lower than those of Trickle in most scenarios, especially in Scenario 2 (the group clustered around the gateway). For the baseline, Melete sent a slightly higher number of requests but a smaller number of chunks. Also, since Melete maintained two groups (group 0 and the group for code distribution), the completion time of Melete was longer due to the delay in disseminating version information. In Scenario 4, the costs of Melete and Trickle were similar. This was expected, since the requests and chunks needed to be forwarded across the entire network. We believe that adding more gateways can alleviate this problem.

We also observed that, using Melete, the number of requests and chunks increased with ρ. This indicates that, to disseminate code to a small fraction of nodes, selective methods such as Melete are preferable to Trickle. The completion time of Melete decreases with ρ, since more requesters lead to faster response times.

Impact of Node Density: We varied the spacing factor to 5, 10, and 20 feet while keeping the other parameters unchanged. In Figure 7, we illustrate the network traffic and completion time, grouped by the number of chunks per capsule, p. The data was averaged over 200 runs of the packet-level TOSSIM.

Figure 7. Impact of node density on the network traffic and completion time (400 nodes, 4 groups with 20 members each, 20 to 40 requests per group). Panels: (a) # of forwarded requests, (b) # of forwarded chunks, (c) completion time, for spacings of 5, 10, 15, and 20 feet.

We observed that both the network traffic and the completion time increased with p, as expected. With a fixed p, the number of forwarded requests decreased with node density. This was because the number of nodes flooded by each request increased with node density, facilitating responder discovery. However, an intriguing phenomenon was that the number of forwarded chunks increased with node density. Detailed simulation traces indicated that with high node density, the forwarding region was often unnecessarily large, containing multiple responders. Since the Trickle parameter m scales logarithmically with node density [12], redundant chunks were propagated throughout the forwarding region. This was confirmed by the logarithmic increase in the number of forwarded chunks with respect to node density. We also observed a dramatic increase in the completion time when the spacing was varied from 15 to 20 feet. This was because 20-foot spacing resulted in very few good-quality communication links, and hence slow expansion of the forwarding region. The exponentially decreasing advertising rate led to a more dramatic increase in the discovery time than in the number of forwarded requests.

Load Distribution: A balanced load distribution is important to ensure a long lifetime for the entire network. We simulated a 400-node network on a 20 × 20 grid with 15-foot spacing using the bit-level TOSSIM. In Figure 8, we illustrate the communication distribution for code chunks in one simulation run with p = 2. The x- and y-axes represent space, with nodes located at line intersections.

Figure 8. Communication topography of a simulation run (400 nodes, 4 groups, each with 20 members and 20 to 40 requests in 10 minutes). Panels: (a) transmissions, (b) receptions.

We observed that most of the sensor nodes transmitted fewer than 20 chunks and received fewer than 60 chunks. More communication was performed by nodes around the center, since they were more likely to be involved in a forwarding region than nodes at the boundary. Also, the locations of nodes with more communication in the transmission topography match those in the reception topography.
6 Implementation
6.1 Supporting Concurrent Maté Applications
Our implementation of Melete is based on TinyOS version 1.1.15 and Maté version 2.2.2. We have implemented four Maté contexts in Melete: Once, Reboot, Timer, and Broadcast. To separate the storage and execution space of each application, we duplicated several data structures in Maté, including capsules, stacks, registers, heaps, locks, and those for the various contexts and instructions. Each component (including contexts) maintains an application pointer that indicates the data structures of the currently active application. Whenever a context for a specific application is invoked, all components are signaled to switch their application pointers accordingly. Thus, the overhead of context switching is limited to overwriting the application pointers.
A special context is the Broadcast context. In our implementation, a sensor node can specify a set of target groups as the receivers of a broadcast. This group information is encoded in every broadcast message. Upon receiving a broadcast message, a node stores the data for all target groups. The broadcasting instructions (i.e., bcast() and bcastbuf()) were also modified accordingly.
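The pointer-switching scheme can be sketched as follows in C; the structure and names below are illustrative only and do not reproduce the actual Maté/Melete data structures.

/* Sketch of per-application state separation (hypothetical names). Each VM
 * component keeps a pointer to the state of the currently active application,
 * so a context switch only rewrites these pointers. */
#define MAX_APPS   5
#define STACK_SIZE 16
#define HEAP_SIZE  16

typedef struct {
    short stack[STACK_SIZE];   /* per-application operand stack */
    unsigned char sp;
    short heap[HEAP_SIZE];     /* per-application heap/registers */
    unsigned char locks;       /* per-application lock bitmap */
    /* ... capsules, context descriptors, etc. */
} app_state_t;

static app_state_t apps[MAX_APPS];   /* duplicated storage, one slot per application */
static app_state_t *active_app;      /* the "application pointer" of one component */

/* Invoked when a context belonging to application app_id is scheduled:
 * every component retargets its application pointer. */
static void switch_application(unsigned char app_id) {
    active_app = &apps[app_id];      /* O(1): no state is copied or saved */
}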
6.2 Dynamic Grouping
Each sensor node maintains a vector indicating its current group status. A set of new instructions was added to Maté for a node to join or leave a group (joingrp() and leavegrp()), query its group status (chkmember()), and identify the associated group of a code capsule (group()). Also, each node
maintains a mapping between the associated groups and the data structures in Maté. This mapping is used to switch the application pointers when group switching occurs. Furthermore, when a node joins a group, all corresponding contexts are initialized. The code version information is also reset so that the latest code will be requested. When a node leaves a group, its contexts are reset. In particular, timers in the Timer context are stopped. Also, its contexts are removed from the ready and running queues of the VM, and from the task queues of the sensing and communication interfaces.
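A minimal sketch of this bookkeeping is given below, with hypothetical names; the actual joingrp()/leavegrp() handlers are implemented inside the Maté VM.

/* Sketch of the dynamic-grouping state (hypothetical names): a membership
 * bit vector plus a mapping from a group to its per-application VM state. */
#define MAX_GROUPS 8

static unsigned char group_vector;              /* bit i set => member of group i */
static signed char   group_to_app[MAX_GROUPS];  /* group id -> application slot, -1 if none */

/* Stubs standing in for the corresponding VM operations. */
static void reset_contexts(signed char app)    { (void)app; /* re-init the app's contexts */ }
static void reset_code_version(signed char app){ (void)app; /* force re-request of latest code */ }
static void stop_timers(signed char app)       { (void)app; /* halt the app's Timer context */ }
static void dequeue_contexts(signed char app)  { (void)app; /* drop contexts from VM/task queues */ }

static int chk_member(unsigned char group) {
    return (group_vector >> group) & 1;
}

static void join_group(unsigned char group, signed char app_slot) {
    group_vector |= (unsigned char)(1 << group);
    group_to_app[group] = app_slot;
    reset_contexts(app_slot);        /* initialize all corresponding contexts */
    reset_code_version(app_slot);    /* so the latest capsules will be requested */
}

static void leave_group(unsigned char group) {
    signed char app_slot = group_to_app[group];
    group_vector &= (unsigned char)~(1 << group);
    group_to_app[group] = -1;
    stop_timers(app_slot);           /* stop timers in the Timer context */
    dequeue_contexts(app_slot);      /* remove contexts from the ready/running and task queues */
}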
6.3 Group-Keyed Code Dissemination
To incorporate the FORWARD state into Trickle, we added data structures to maintain the version information of all groups in the network. One extra code chunk storage slot was also added as the forwarding cache. The code version, status, and chunk messages of Trickle were modified to include group information. Since the status message serves as a request in Melete, a TTL field was added to the message.
In our current implementation, a state transition for a specific application does not affect the execution of other applications. For example, for a node in the REQUEST state, the execution of the requested application is halted, but the other applications execute as normal. To do so, we maintain a flag for every application in the VM. A state transition in Trickle affects the flag of the corresponding application only.
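A sketch of the extended message format and per-application state flag is shown below; the field and state names are assumptions for illustration, not the actual Trickle/Melete message layout.

/* Sketch (hypothetical names): a group-keyed status message and a
 * per-application dissemination state flag. */
typedef enum { STATE_MAINTAIN, STATE_REQUEST, STATE_RESPOND, STATE_FORWARD } dissem_state_t;

typedef struct {
    unsigned char group;     /* group key: which group's code this message refers to */
    unsigned char version;   /* code version being advertised or requested */
    unsigned char ttl;       /* added so a status message can act as a bounded request */
} status_msg_t;

#define MAX_APPS 5
static dissem_state_t app_state[MAX_APPS];  /* one flag per application: a transition for
                                               one application (e.g., to STATE_REQUEST)
                                               halts only that application's execution */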
Table 2. Code size and memory usage (in KB)

              Maté    # of applications in Melete
                        1      2      3      4      5
  Code        36.7    47.5    48     48     48     48
  Memory       2.9     3.2    4.7    6.2    7.7    9.2

In Table 2, we compare the code size and memory usage of Maté and Melete. Each application contains five contexts (Once, Reboot, Broadcast, and two Timer contexts). For Melete, four extra instructions (Section 6.2) are included to support the minimal grouping operations. Although not presented here due to space limitations, we have also implemented other operations to exchange group information between neighbors on a timed cycle, as well as a set of Maté opcodes to utilize the exchanged group information. We can see that the memory usage of Melete increases almost linearly with the number of applications. However, since the number of applications does not affect the logic of the system, the code size remains constant, with the exception that index computations are eliminated when only one application is supported.
7 Empirical Study
In this section, we first study the performance of the group-keyed code dissemination in an office area. We then demonstrate the usefulness of dynamic grouping through an example motion tracking application.
7.1 Code Dissemination in an Office Space
We studied the performance of our code dissemination with a testbed of 16 TelosB nodes. These nodes were deployed in an office area of approximately 120 × 50 feet (Figure 9). All nodes were members of group 0. Nodes 0, 6, 7, and 12 were also members of group 2. Node 0 was connected to a computer and used as a gateway. We tuned the transmission power of the nodes so that the hop distances from nodes 6, 7, and 12 to node 0 were 2, 2, and 3 hops, respectively. In each experiment, we updated a code capsule of group 2, which consisted of 2 chunks. All nodes recorded their communication related to the code dissemination and then reported it to node 0 after 5 minutes. Nodes 6, 7, and 12 also recorded the time delay to receive the code and reported it to node 0 after 5 minutes. We set H = 1 in our study.

Figure 9. An office space for empirical study (16 nodes; 6, 7, 12 are requesters).

Figure 10. Code dissemination in an office area (Figure 9). Panels: (a) transmitted requests per sensor node ID, (b) transmitted responses per sensor node ID, (c) time delay (sec) over the 50 experiments.

In Figures 10(a) and (b), we show the network traffic for each node, averaged over 50 experiments. We observed that most nodes forwarded less than one request and fewer than 1.5 responses. Nodes farther away from node 0 tended to send more requests. Also, nodes in the middle (4, 5, 8, 9, 10, 11) forwarded more responses than the others, because they were involved in the forwarding region in almost all 50 experiments. Figure 10(c) shows the time to complete updating the three requesters in the 50 experiments, sorted in increasing order. The dotted line gives the average time, 12.8 seconds. Note that since the measured time delay also included the time to propagate the version information, it differs from the notion of completion time used in Section 5. We observed that the completion time for most nodes was within a few seconds.
7.2 Motion Tracking Application
We placed an array of 11 sensor nodes in a straight line in the office area shown in Figure 9, with around 10 feet of spacing between them. Node 0 was designated as a gateway. We tuned the radio transmission power so that it took 4 hops to communicate between nodes 0 and 10. The sensors were placed under overhead lights. A shadow over the sensor array was created by blocking the light with a book. By adjusting the book, we could manipulate the size and position of the shadow. We designed the following application to track the shadow with dynamic grouping.

We used group G0 for grouping purposes. A background application, G1, sensed the humidity and broadcast it to its neighbors every 10 seconds. Another application, G2, was used to track the shadow. Nodes would join G2 if they were under the shadow and leave G2 otherwise. Initially, node 0 was set to be a member of all groups so that it stored all the code. Every other node was set to be a member of G0 and G1 only. We used a Timer context of G0 to sense the light once per second. Whenever the light reading of a node fell below a threshold, the node would join G2. For efficient data gathering in G2, we also implemented a simple data aggregation algorithm. Specifically, a Timer context in G2 sensed the light once every half second. It then broadcast the light reading to its one-hop neighbors in G2, which triggered a Broadcast context of G2. In the Broadcast context, each node received the data from its neighbors. Then, the node with the smallest ID combined the received data into one packet to report to node 0. The corresponding TinyScript programs are sketched in Figure 11. The three code capsules for G2 consisted of 6 chunks in total.

Timer context of G0
1. if (int(photoactive()) < 200) then
2.   joingrp(2);                    // join group G2 if under shadow
3. end if

Timer context of G2
1. buffer received, data;
2. shared leader;
3. private reading;
4. reading = int(photoactive());
5. if (reading > 200) then
6.   ...                            // report "received" if leader
7.   leavegrp(group());             // leave the associated group
8. else
9.   leader = 1;
10.  received[] = id();
11.  received[] = reading;
12.  ...                            // report "received" if it is full
13.  data[0] = id();                // prepare data to broadcast
14.  data[1] = reading;
15.  bcast(4, data);                // 4 is the bitmask of G2
16. end if

Broadcast context of G2
1. buffer received, buf;
2. shared leader;
3. buf = bcastbuf();
4. if (leader) then
5.   if (id() < buf[0]) then        // compare the IDs
6.     received[] = buf[0];         // aggregate data from neighbors
7.     received[] = buf[1];
8.     ...                          // report "received" if it is full
9.   else
10.    leader = 0;
11.    bclear(received);            // no aggregation
12.  end if
13. end if

Figure 11. Code capsules for the motion tracking application

In each experiment run, the shadow was initially cast on node 1 and then gradually moved towards node 10. The size of the shadow was adjusted to always cover two adjacent nodes. We controlled the moving speed of the shadow so that when a node joined G2, it would always have enough time to obtain the code from its predecessor. Thus, code forwarding was not invoked and we could focus on dynamic grouping. We varied the highest advertising rate from once per second to twice per second. In Figure 12, we report the time delay for each node to join G2 and receive all the code, averaged over 10 runs. We observed a significant improvement in the time delay when increasing the highest rate: the average dropped from 12.4 seconds to 5 seconds. The large variations in the results might be caused by the dynamics in the office environment.

Figure 12. Time delay for dynamic grouping and code transportation with a motion tracking application (time delay in seconds per sensor node ID, for highest advertising rates of once and twice per second).

To further demonstrate the effectiveness of dynamic grouping, we also designed another application, G3, to form a group of sensor nodes one hop away from the shadow. We slightly changed line 15 of the Timer context of G2 so that the light reading was broadcast to both G0 and G2 (we did so by setting the bitmask to 5; the number of broadcasts remains unchanged). In the Broadcast context of G0, nodes not in G2 (determined by calling chkmember()) would join G3. We successfully ran the experiment with the same settings as above.

8 Conclusion and Future Directions
We have presented Melete, a system that supports the deployment of concurrent applications in a WSN, which is crucial for commercial deployment of WSNs. We have discussed the design and implementation of Melete to support the storage and execution of concurrent applications on top of the Maté virtual machine, flexible application organization and deployment using dynamic grouping, and group-keyed code dissemination on top of Trickle. We have also examined, via both analysis and simulation, various techniques to optimize the code forwarding process, a key component of the group-keyed code dissemination. These techniques include lazy forwarding, progressive flooding, and randomized code caching. Our empirical study has validated the efficiency of group-keyed code dissemination. We have also demonstrated the usefulness of dynamic grouping with a motion tracking application in a real deployment.
By using TinyScript, Melete inherits its programmability from Maté. The protected storage and execution space for each application ensures reliable application deployment and execution by multiple users. Dynamic grouping improves flexibility in application development and maintenance with respect to variations in the environment and user requirements. Moreover, our analytical and simulation results have shown that the code forwarding mechanism scales well with both code size and node density. We plan to further explore techniques for efficient code dissemination. For example, we can suppress redundant forwarders to reduce code forwarding costs and avoid the blocking phenomenon.
Melete assumes no specific network topology other than a connected graph. However, some practical WSNs are deployed with either a tree or a mesh-like network topology. It is therefore interesting to investigate the impact of such topologies on both dynamic grouping and code dissemination. We are also interested in integrating various sleep scheduling techniques into Melete. Along these lines, we are interested in adapting Melete to support the unifying sensornet protocol (SP) [36] (Trickle is adapted to support SP in [36]).
Given the harsh resource constraints inherent in many real-world WSN deployments, it appears important, while supporting multiple dynamic and concurrent applications, to provide application-driven or application-adaptive quality-of-service (QoS) features. In long-term commercial deployments of Melete, it is important to learn application requirements beyond those expressed by a series of calls to, e.g., the TinyScript compiler and byte-code injector. For example, given the constrained resources of a sensor network, we would like to incorporate admission control into a gateway system that supports Melete to ensure the QoS of applications. Application prioritization for differentiated QoS is also important in situations involving mission-critical applications. Other issues include preventing poorly implemented applications from improperly reorganizing their group membership or updating their code too frequently. Implementing a formal cost model, constraint checks, and pacing features in the WSN gateway are possible directions.
Two other systems have been partially developed under the auspices of the Muse research project. Mneme (the Muse of Memory in Greek mythology) is a gateway-resident component which records every past application-level programming request by users of the network and their general disposition once injected into the network (assuming they were allowed by the admission control policy and other constraint checks). Facts about the disposition are incrementally learned as various data reports return to the gateway after the related request injection. The Aœde (the Muse of Song in Greek mythology) application-adaptive messaging system (described in the context of a larger system in [37, 38]) or another general messaging layer may be used to report data from the node population to the gateway.
Acknowledgments
This work was supported by Motorola Labs at Schaumburg, Illinois. The authors would like to thank Nitya Narasimhan (of PPAL), Venu Vasudevan (director of PPAL), Ross J. Lillie, Brian Lucas, Ralph D'Souza, Matt Perkins, Ken Cornett, Tim Bancroft (all of Motorola Labs), Yeon Jun Choi, Bhaskar Krishnamachari, the anonymous reviewers, and our shepherd, Andreas Savvides, for their valuable discussions and suggestions on this work. The authors also thank Phil Levis for releasing the Maté source code.
9 References
[1] P. Levis, D. Gay, and D. Culler, "Active sensor networks," in NSDI, Mar. 2005.
[2] Y. Yu, B. Krishnamachari, and V. K. Prasanna, "Issues in designing middleware for wireless sensor networks," IEEE Network Magazine, special issue on Middleware Technologies for Future Communication Networks, vol. 18, no. 1, pp. 15–21, Jan. 2004.
[3] P. J. Marrón, A. Lachenmann, D. Minder, J. Hähner, R. Sauter, and K. Rothermel, "TinyCubus: A flexible and adaptive framework for sensor networks," in European Workshop on Wireless Sensor Networks, Jan. 2005.
[4] J. Steffan, L. Fiege, M. Cilia, and A. Buchmann, "Towards multi-purpose wireless sensor networks," in International Conference on Sensor Networks, Aug. 2005.
[5] V. Bhandari and L. J. Rittle, "A group programming architecture for multifunctionality wireless sensor networks," in International Conference on Networked Sensing Systems (INSS), May 2006.
[6] W. F. Fung, D. Sung, and J. Gehrke, "Cougar: the network is the database," in ACM SIGMOD, June 2002.
[7] S. R. Madden, M. J. Franklin, J. M. Hellerstein, and W. Hong, "TAG: a Tiny AGgregation service for ad-hoc sensor networks," in Symposium on Operating Systems Design and Implementation (OSDI), Dec. 2002.
[8] C.-L. Fok, G.-C. Roman, and C. Lu, "Rapid development and flexible deployment of adaptive wireless sensor network applications," in ICDCS, June 2005, pp. 653–662.
[9] A. Boulis, C. C. Han, and M. B. Srivastava, "Design and implementation of a framework for programmable and efficient sensor networks," in ACM MobiSys, May 2003, pp. 187–200.
[10] L. Szumel, J. LeBrun, and J. D. Owens, "Towards a mobile agent framework for sensor networks," in IEEE Workshop on Embedded Networked Sensors, May 2005.
[11] C. Frank and K. Römer, "Algorithms for generic role assignment in wireless sensor networks," in ACM SenSys, Nov. 2005.
[12] P. Levis, N. Patel, D. Culler, and S. Shenker, "Trickle: A self-regulating algorithm for code propagation and maintenance in wireless sensor networks," in NSDI, Mar. 2004.
[13] L. J. Rittle, V. Vasudevan, N. Narasimhan, and C. Jia, "MuSE: Middleware for using Sensors Effectively," in International Conference on Networked Sensing Systems (INSS), May 2005.
[14] J. Liu, M. Chu, J. Liu, J. Reich, and F. Zhao, "State-centric programming for sensor and actuator network systems," IEEE Pervasive Computing, vol. 2, no. 4, pp. 50–62, Oct. 2003.
[15] M. Welsh and G. Mainland, "Programming sensor networks using abstract regions," in NSDI, Mar. 2004.
[16] K. Whitehouse, C. Sharp, E. Brewer, and D. Culler, "Hood: A neighborhood abstraction for sensor networks," in ACM MobiSys, June 2004.
[17] C.-Y. Wan, A. T. Campbell, and L. Krishnamurthy, "PSFQ: A reliable transport protocol for wireless sensor networks," in WSNA, Sep. 2002.
[18] J. W. Hui and D. Culler, "The dynamic behavior of a data dissemination protocol for network programming at scale," in ACM SenSys, Nov. 2004.
[19] T. Liu and M. Martonosi, "Impala: A middleware system for managing autonomic, parallel sensor systems," in ACM Symposium on Principles and Practice of Parallel Programming, June 2003.
[20] Q. Huang, C. Lu, and G.-C. Roman, "Spatiotemporal multicast in sensor networks," in ACM SenSys, Nov. 2003.
[21] A. Sheth, B. Shucker, and R. Han, "VLM2: A very lightweight mobile multicast system for wireless sensor networks," in IEEE Wireless Communications and Networking Conference (WCNC), Mar. 2003.
[22] Z. Cheng and W. B. Heinzelman, "Searching strategy for multi-target discovery in wireless networks," in Workshop on Applications and Services in Wireless Networks, Aug. 2004.
[23] N. Chang and M. Liu, "Revisiting the TTL-based controlled flooding search: Optimality and randomization," in MobiCom, Sep. 2004.
[24] N. Sadagopan, B. Krishnamachari, and A. Helmy, "Active query forwarding in sensor networks (ACQUIRE)," Journal of Ad Hoc Networks, vol. 3, no. 1, pp. 91–113, Jan. 2005.
[25] D. Braginsky and D. Estrin, "Rumor routing algorithm for sensor networks," in WSNA, Sep. 2002.
[26] S.-Y. Ni, Y.-C. Tseng, Y.-S. Chen, and J.-P. Sheu, "The broadcast storm problem in a mobile ad hoc network," in MobiCom, Aug. 1999.
[27] I. Clarke, O. Sandberg, B. Wiley, and T. Hong, "Freenet: A distributed anonymous information storage and retrieval system," in Workshop on Design Issues in Anonymity and Unobservability, July 2000.
[28] E. Cohen and S. Shenker, "Replication strategies in unstructured peer-to-peer networks," in ACM SIGCOMM, Aug. 2002.
[29] B. Krishnamachari and J. Ahn, "Optimizing data replication for expanding ring-based queries in wireless sensor networks," in WiOpt, Apr. 2006.
[30] J. Polastre, R. Szewczyk, and D. Culler, "Telos: Enabling ultra-low power wireless research," in ACM/IEEE International Symposium on Information Processing in Sensor Networks (IPSN), Apr. 2005.
[31] Z. Cheng and W. Heinzelman, "Flooding strategy for target discovery in wireless networks," in International Workshop on Modeling Analysis and Simulation of Wireless and Mobile Systems, 2003, pp. 33–41.
[32] Y. Yu, L. J. Rittle, V. Bhandari, and J. B. LeBrun, "Supporting concurrent applications in wireless sensor networks," Motorola Labs, Tech. Rep., 2006. Available: http://techpubs.motorola.com/IPCOM/139103
[33] P. Levis, N. Lee, M. Welsh, and D. Culler, "TOSSIM: Accurate and scalable simulation of entire TinyOS applications," in ACM SenSys, Nov. 2003, pp. 126–137.
[34] A. Woo, T. Tong, and D. Culler, "Taming the underlying challenges of multihop routing in sensor networks," in ACM SenSys, Nov. 2003.
[35] P. J. Clark and F. C. Evans, "Distance to nearest neighbor as a measure of spatial relationships in populations," Ecology, vol. 34, pp. 445–453, 1954.
[36] J. Polastre, J. Hui, P. Levis, J. Zhao, D. Culler, S. Shenker, and I. Stoica, "A unifying link abstraction for wireless sensor networks," in ACM SenSys, Nov. 2005.
[37] H. Zhang, L. J. Rittle, and A. Arora, "Application-adaptive messaging in sensor networks," The Ohio State University, Tech. Rep. OSU-CISRC-6/06-TR63, 2006.
[38] H. Zhang, A. Arora, L. J. Rittle, and P. Sinha, "Messaging in sensor networks: Addressing wireless communications and applications diversity," in Handbook of Real-Time and Embedded Systems. CRC Press, to appear.