USC/ISI Technical Report ISI-TR-2005-608 Submitted to Mobisys 2006, Uppsala, Sweden. June 19-22, 2006.
RBP: Reliable Broadcast Propagation in Wireless Networks Fred Stann, John Heidemann, Rajesh Shroff, Muhammad Zaki Murtaza Abstract connectivity and poorly convergent global knowledge are the norm. Approaches like application layer jitter, MAC layer random slot selection [36], and PHY layer capture [35] can help to lower the probability of unrecoverable broadcast collisions, but they don’t guarantee broadcast propagation. Finally, very few proposed improvements for broadcasting have been tested outside of simulation. Real-world topologies rarely have uniform density. Node density in wireless networks is particularly important for broadcasting because it correlates to the number of disjoint paths between node pairs, which has a major impact on the probability of eventual packet delivery [26]. So, from a MAC and PHY layer perspective, reliable broadcasting can be difficult in some environments. Perhaps because of these challenges, widely available protocols, such as 802.11, do not provide general support for reliable broadcast. Nonetheless, broadcasting is an integral part of popular protocols for data dissemination in wireless networks. When AODV [24] has no route to a destination node, it broadcasts Route Request messages. Similarly, ODMRP [17] broadcasts a Join Query to initially find target nodes. Tiny DB floods an SRT build request in order to construct attribute-based Semantic Routing Trees [20]. IDSQ floods an m-hop neighborhood in order to initially locate a moving target [41]. Flooding-limitation protocols like SPIN [15] and BARD [32] degrade to flooding when there is no overlap of in-network history and the current query. This paper aims to mitigate the reliance of higher-layer protocols on the reliability of broadcasting by providing reliable broadcast as a system service. In this work we examine reliable broadcast in the context of multi-hop routing and resource discovery. The challenge of reliability is greater in low power networks, and complexities of interference are poorly understood and modeled in simulation. We therefore base the results of our work on testbed experiments from an available 20node sensor network, augmented by simulations to systematically explore the design space. In wireless sensor networks, broadcast reliability is directly affected by radio interference, hop counts, hidden nodes, congestion, node density, and transient interference patterns [40, 34]. The limited radio power of typical sensor network motes necessitates multi-hop routes from data sources to data sinks [38]. Reliability for broadcasting can be thought of as the probability that a broadcast “wave” will reach the most distant “tier” of nodes away from the source of the
Low-powered radios, interference patterns, and multi-hop connections make most wireless networks inherently unreliable. While MAC protocols provide high reliability for unicast via mechanisms such as ARQ, higher-layer network protocols often require broadcast transmissions as well. Prior work on broadcast improvements has focused on efficiency and generally not related reliability to general routing protocols. Existing broadcast protocols often require multi-hop topology information. Instead, we present a very simple protocol, requiring only local information, and highlight the performance impact of our protocol on routing with a secondary evaluation of energy efficiency. We develop our protocol as a service between the MAC and network layer, taking information from both. Our approach is based on two principles: First, we exploit network density to achieve nearperfect end-to-end reliability by requiring moderate (50-70%) reliability when nodes have many neighbors. Second we identify areas of sparse connectivity where important links bridge clusters of dense nodes, and guarantee connectivity over those links. Routing performance depends on wireless propagation, so we develop a new error model that considers both correlated and independent loss in broadcast traffic based on testbed experiments. We demonstrate, through controlled simulations using this model, and through complete testbed experiments, that this hybrid approach is necessary to provide near-perfect accuracy with good efficiency. In a real testbed we show 99.8% accuracy with 48% less overhead than through repeated flooding. The contributions of this paper are the introduction of our protocol Reliable Broadcast Propagation, definition of metrics that balance efficiency and reliability, and introduction of a more accurate model for broadcast error.
1 Introduction Numerous studies have documented a wide variance in reliable packet delivery for energy-constrained wireless environments [40, 38, 41, 31]. This physicallayer problem can be ameliorated for unicast packets via MAC layer mechanisms such as CTS/RTS/ACK, ARQ, and FEC [16, 39]. Unfortunately, control traffic overhead has prevented such techniques from being effective for PHY layer broadcasting. There is a wide range of prior work looking at improving broadcast reliability and efficiency. Reliable broadcast schemes based on TDMA [6] and self-pruning [3], employed in high powered wireless networks do not adapt well to low power networks where dynamic This work was partially supported by the National Science Foundation (NSF) under grant number NeTS-NOSS-0435517, "Sensor Networks for Undersea Seismic Experimentation", and by support from Intel Corporation. Fred Stann received support from Xerox Corporation. Fred Stann and John Heidemann are with USC/Information Sciences Institute, 4676 Admiralty Way, Marina Del Rey, CA, USA E-mail:
[email protected],
[email protected]
1
broadcast [37]. Broadcasting in wireless is most often achieve via flooding, which is the retransmission of a packet exactly once by each node in the network [23]. In this paper we present a new protocol that enhances the reliability of flood-based broadcasting in wireless networks. The goal of our protocol is to ensure that a near perfect percentage of broadcasts reach all tiers. A general category of task that can be impacted by unreliable broadcast is “resource discovery.” For routing protocols, like DSR [14] and AODV [24] this is simply target location. Resource discovery for applications is generally concerned with the location of nodes that are generating data of interest to an application sink. The reliability of resource discovery could be defined as the ratio of discovered nodes matching a query to the actual number of relevant data sources in the network. Wireless sensor network data dissemination protocols often combine resource discovery with attribute-based routing [10]. With attribute-based routing, unreliable broadcast can simultaneously affect both application and routing. For a certain class of application, reliable resource discovery is essential. Such applications exhibit a rapid decay in efficacy when even a single exploratory epoch fails to find targeted resources. Examples include interactive applications where a user is waiting for a query response [27], real-time applications like targettracking [13, 41], and signal processing applications whose accuracy depends on a critical mass of data [18]. In this paper we present a new protocol for reliable broadcasting. Our protocol is distributed in the sense that every node makes its own decisions about retransmission without any global or hard state. Nodes examine traffic in their local neighborhood to gauge the degree of broadcast propagation. Decisions to retransmit a broadcast are parameterized by local node density. In our protocol, lower density equates to a higher likelihood of retransmission for imperfect propagation. Naturally, chain reactions that result in excessive flooding must be squelched. Timeliness is a natural side effect of our protocol because, with it, a single flood has a higher probability of achieving a sought for response. The same probability may be achieved via multiple less reliable floods, but timeliness will suffer and, as we will show, the effective overhead will be worse. Details of the protocol are presented in Sections 3 and 4. The basic methodology of our investigation is to use testbed experiments and simulations to validate a protocol based on analysis. The goal of our experiments is to unearth the cost of our scheme relative to the reliability that it provides. The contribution of this paper, then, is threefold. First, it presents a new protocol, Reliable Broadcast Propagation (RBP), as the preferred technique for wireless network applications that rely on broadcasting for timely and accurate system response. Second, it
defines a new broadcast metric that balances efficiency with reliability . Third, it introduces a new simulation error model for more accurate modeling of packet reception that is especially important for experiments related to broadcasting.
2 Related Work Related work for this paper can be roughly divided into five categories: wired-network broadcast reliability, reliable broadcast schemes for MANETs, wireless MAC layer reliability, transport layers for sensor networks, and, to a limited extent, resource discovery algorithms. Broadcast reliability for wired networks is concerned with upper-layer tasks such as the dissemination of routing information, real-time multicasting, and database replication. Protocols, like OSPF [21] and PGM [30], rely on the maintenance of unicast trees for the delivery of ACKs or NACKs to trigger retransmissions. Database replication techniques rely on eventual data convergence over time via repeated or multiple random transmissions [5]. Because point-topoint networks can not exploit the ubiquitous nature of wireless broadcasting, there seems to be little that can be learned from these techniques applicable to wireless networks. Nonetheless, work in reliable OSPF has demonstrated that broadcast reliability is directly affected by the number of shared edges that exist among alternate paths between node pairs [26]. Because wireless broadcasting can be modeled schematically as a set of discreet connections [34], shared-edge analysis is an effective tool for predicting the probability of eventual delivery of a broadcast packet to all nodes in a wireless network. Most wireless broadcast analysis is predicated on an assumption uniform density. We demonstrate in Section 5 that local density, in the context of shared-edge analysis, can have a profound affect on the probability that all nodes in a wireless network receive a broadcast. Broadcasting schemes for MANETs fall into four general categories: simple flooding, area-based methods, neighborhood-knowledge-based methods, and probabilistic schemes [36]. Ho et al. make several important points about the appropriateness of flooding for MANETs [12]. Flooding is topology-independent, and the redundancy of flooding makes it inherently reliable. Also, traditional broadcasting techniques that rely on the maintenance of hard state converge too slowly for dynamic environments. Area-based and neighborhood-based methods are aimed at efficient broadcasting without loss of reliability. They require knowledge of the absolute locations of nodes or the hop counts to nodes. Area and neighborhood schemes strive to eliminate redundancy from flooding. Nodes which can’t increase the scope of a broadcast prune themselves from the set of nodes that retransmits a broadcast. Our
2
approach is more concerned with high reliability rather than average reliability at low cost. Gossip-based ad-hoc routing is a probabilistic scheme which includes a heuristic to prevent premature gossip death [8]. The heuristic is sensitive to network density in that a decision to not forward a broadcast (due to failed probability) can be overruled when an insufficient number of neighbors have been overheard to transmit the same broadcast. The interesting aspect of many of these schemes for MANETs is the concept of making distributed decisions based on the snooping of local neighborhood traffic. The protocol that we present in the next two sections is somewhat similar to a one-hop self-pruning scheme [19] combined with the heuristic employed in Gossip. Commonly used wireless MAC layers, like 802.11 [16] and S-MAC [39] do nothing to ensure the reliability of PHY broadcasting other than random slot selection when a busy channel goes idle. Tang and Gerla have proposed the addition of broadcast ACKs to 802.11 [33]. Numerous TDMA schemes have been proposed that provide contention-free reliable broadcast [2]. Because TDMA schemes usually require cluster heads with intimate knowledge of the nodes they support, they are not typically used in low-power networks. TRAMA is a novel adaptation of TDMA to sensor networks in which nodes use a schedule exchange protocol and adaptive election to converge on slot assignments [22]. Node schedules are piggy-backed on data packets. Z-Mac is a sensor net MAC that uses a TDMA schedule when contention is high and CSMA for low contention [29]. It requires knowledge of two-hop neighbors, and schedule exchanges. It appears that TRAMA and Z-MAC have only been simulated; studies with real-world error conditions remain to be done. The goal for our protocol was to achieve networkwide reliability, so we also examined ideas from transport layer protocols for low-powered wireless networks. ESRT is a sensor network protocol that attempts to increase the reporting rate of a source until a requisite sampling rate is achieved at the sink (while avoiding congestion) [1]. RMST is aimed at the fragmentation and reassembly of binary objects that are larger than the network mtu [31]. ETX uses forward and backward reliabilities to select good end-to-end paths in a sensor network [4]. A common denominator of these protocols is that they are all concerned with unicast transmissions and require end-to-end book keeping. Since every node is an endpoint in a broadcast, transport layer mechanisms are not practical for our investigation. Nonetheless, our scheme may indirectly enhance transport and routing protocols that rely on lower-layer broadcasts to provide the information on which they base their decisions. Although we decided to use resource discovery as the upper layer protocol for our investigation, it is not the
primary concern of this paper. We mentioned, in Section 1, that AODV, ODMRP, TinyDB, Diffusion, IDSQ, and BARD all rely on broadcasting under certain circumstances. Even so, not all resource discovery schemes for wireless networks use broadcasting. For example, DCS/GHT [28] uses an underlying geographic routing layer to move data with common attributes to specific nodes by hashing attribute names into geographic coordinates.
3 Reliable Broadcast Algorithm We had several goals in mind when addressing the problem of reliable broadcasting in wireless networks. We wanted an algorithm that would provide near perfect reliability in a timely manner. Unlike prior work, we avoided multihop topology/schedule exchanges. We decided that our algorithm must adapt quickly to changing network conditions without extensive control overhead. Although reliability and efficiency are covariant, we wanted to balance cost and benefit. Finally, we wanted to verify our algorithm with real-world experiments. We therefore designed for non-uniform densities, optimizing for both density and direction. The algorithm that we evolved is very simple in that it only requires a node to know the identity of its one-hop neighbors (i.e. neighbors with whom a node has direct bi-directional communication). Accumulating a list of one-hop neighbors is a trivial task given that such neighbors are in direct communication with a node inside a relatively short window of time. Many MAC and routing protocols already accumulate this information. In contrast to the neighborhood-knowledge schemes alluded to in the last section, we are not trying to make a single broadcast more efficient. Rather, we wish to make a single broadcast more reliable, thereby reducing the frequency with which an upper-layer protocol needs to invoke broadcasting. If an unreliable broadcast propagates with P(S) = .90, then an application that required .99 reliability would need to broadcast twice to ensure success. Our goal is to make a single reliable broadcast more efficient than the equivalent number of unreliable broadcasts required to achieve the same level of reliability. With our algorithm, the first time a node hears a broadcast it retransmits the packet unconditionally, as in a normal flood. As additional neighbors transmit the same packet, the node listens ands keeps track of which neighbors have propagated the broadcast. Armed with one-hop neighbor knowledge, a node can ascertain the percentage of its neighbors that are guaranteed to have seen a packet. We call a transmission by a neighbor an implicit ACK., a term that typically refers to inbound traffic that indirectly acks outbound traffic [25]. When the number of implicit acks seen by a node falls below a predetermined threshold, a node will again retransmit the
3
Figure 1: A sample topology with varying density and appropriate thresholds.
Figure 2:Special Relationship between Black and Gray Node
With our directional sensitivity optimization, nodes keep a histogram of which neighbor was the first to transmit a previously unheard broadcast. For the black node in Figure 16 there would be a uniform distribution in such a histogram. The introduction of jitter for message forwarding in dissemination protocols guarantees that no single neighbor dominates. The gray node, however, sees most upstream traffic (for a time) from a single node. Our solution includes a moving window of time in which a directional histogram is maintained. If a single neighbor has a majority of the histogram, the node sends that upstream neighbor a control message indicating that it has a special relationship to this node. Any node that gets such a message will do up to 4 retries when that downstream node does not ack. This behavior is independent of the normal rules for retries based on density. It is possible for a node to have multiple “important links.“ The combined algorithm is shown in Figure 3.
broadcast packet. In order to avoid the undesirable regeneration of an entire broadcast, nodes hearing a broadcast more than once from the same node will send an explicit unicast ACK to that node (and not retransmit that packet). If the number of neighbors that haven’t acked a packet is less than the number of neighbors that have, the packet is unicast to the unacked neighbors. This results in the least overhead for the retransmission. A key optimization in our algorithm is that both retransmission thresholds and the number of retries are adjusted for neighborhood density. Higher density neighborhoods have lower thresholds with fewer retries. If, for example, there are three or less neighbors, a node will make up to three attempts to propagate the message to all neighbors. For four to six neighbors the threshold is 66% and the number of retries is two. When there is a dense local neighborhood (i.e. eight or more neighbors), “successful” propagation occurs when half of the neighborhood acks and only one retry is attempted if the threshold is not met. The analysis in Section 5 and feedback from early test runs was used in selecting appropriate thresholds. Figure 1 demonstrates the general idea of variable thresholding. We also introduced a “directional sensitivity” optimization into our algorithm. This was in response to a special case of non-uniform density that our testbed experiments revealed to us. When an upstream dense section meets a downstream sparse section, there can be a node at the edge of the dense section that has a large neighbor count, but is the sole provider of traffic to the first downstream node in the sparse section. This situation is depicted in Figure 2. The black node is the sole provider of traffic to the grey node, but it resides in a dense neighborhood. When the grey node fails to hear a broadcast from the black node, retries will rarely happen with our basic algorithm because there is a high probability that at least 50% of the black node’s neighbors will have acked the broadcast. The black node is incapable of recognizing its special relationship with the gray node, but the gray node can easily do so because all of its upstream traffic will come from the black node.
rbp_snoop_send (pkt) { if (broadcast) set rbp_timeout pass through to MAC } rbp_snoop_rec (pkt) { if ( (overheard broadcast by neighbor) OR (explicit ACK) ) mark neighbor ACKed pass through to Routing } rbp_timeout { if ( (percentACKed < threshold (neighborCount) AND (retryCnt