Autonomic Network-layer Multicast Service Towards Consistent Service Quality

Björn Brynjúlfsson, Gísli Hjálmtýsson
Networking Systems and Services Laboratory, Department of Computer Science, Reykjavik University, Reykjavík, Iceland

Kostas Katrinis, Bernhard Plattner
Communication Systems Group, Swiss Federal Institute of Technology, Zurich, Switzerland

This work has been partly supported by the European Union under the E-Next Project FP6-506869.

Abstract

In spite of significant work on real-time content distribution, providing consistent service quality across the Internet remains a challenge. Multicast applications exacerbate the problem, as the number of participants (clients, providers and administrative domains) and the level of heterogeneity in a single session increase. Recently, the increasing complexity of managing and operating the Internet has sparked interest in autonomous networking. In this paper, we introduce vital enhancements to the elementary Single Source Multicast service that facilitate incremental deployment, self-configuration, self-optimization and self-healing. We further introduce an autonomic multicast architecture in which a self-monitoring network-layer multicast automatically adapts the network-layer forwarding service to meet the service quality requirements requested by application-level services.

Keywords: Autonomic communications, multicast, self-configuration, consistent service quality

1 Introduction

In spite of significant work on real-time content distribution, providing consistent service quality across the Internet remains a challenge. For multicast content distribution applications, this challenge is aggravated by increased complexity and heterogeneity in terms of session topology, receiver connectivity and session duration. Part of the challenge of providing consistent service quality is to coordinate application objectives and quality requirements with mechanisms and abstractions across the layers of network functionality.

One potential approach to address this problem would be to employ advanced QoS machinery [2][7] throughout the distribution network. However, while this is effective in coping with congestion by providing preferential delivery for all packets of a multicast flow, increasing the cost throughout the distribution tree for the entire session is overkill. On average, losses and delays of best-effort traffic are acceptable for real-time content distribution in most parts of the Internet. The real issue is excessive variance in service quality.

An economical alternative is to exploit the temporal nature of bottlenecks and employ best-effort forwarding by default, complemented with targeted, localized enhancements applied dynamically at trouble spots and based on local information. To date, the realization of such an approach has been impeded by the static nature of the network core, which provides only routing and forwarding functionality. Part of the past work on active networking (see Section 2) has proposed localized service-specific enhancements; however, the viability of this model has been questioned due to the considerable complexity of service development.

The realization that configuration management and system operation are becoming the most significant cost and complexity factors in running modern systems has sparked a research initiative towards a constantly evolving autonomic network. Researchers at IBM [14] have defined autonomic systems as satisfying four requirements: self-configuration, self-optimization, self-healing, and self-protection. Moreover, such systems are to configure themselves based on high-level policies or service descriptions that specify service objectives rather than how to accomplish those objectives.

Based on these principles, this paper proposes a network-layer service framework where nodes forming a multicast graph autonomously adapt the forwarding service to enhance the consistency of service delivery. The architecture of our system consists mainly of three constituent parts: a) a self-configuring multicast topology management plane, b) a measurement and monitoring plane throughout the multicast topology, and c) an autonomic component that reflexively adapts the forwarding service based on information received from the other two planes. We discuss how the three system components cooperate to combat service degradation pertaining to three critical QoS parameters: packet loss, delivery delay and route failures. We are currently evaluating the performance of the proposed system through simulations. Our preliminary results demonstrate that dynamic activation of link-local retransmission significantly reduces the service fluctuation experienced by a group of receivers that are heterogeneous in terms of loss rates.

The primary contribution of this paper is a multicast service that provides consistent service quality in real-time multicast content distribution. Second, by combining simple elementary mechanisms and matching them autonomously to higher-layer services, a fully autonomic multicast service is realized. Furthermore, this paper identifies for the first time the primitives that a future autonomic network architecture should offer to support autonomic multicast. Finally, we show the viability of our ideas by evaluating them with simulations.

The rest of this paper is organized as follows. After presenting related work in the following section, we discuss in the main section of this paper (Section 3) the three components that constitute the autonomic multicast service model. In Section 4 we demonstrate the benefits of the architecture with simulations, and in Section 5 we conclude.

2 Related Work

Although a significant amount of work exists on some of the elements of our system, we believe this work to be pioneering in how the elementary mechanisms are used and integrated to create an autonomic multicast service.

IP multicast is still lacking wide deployment despite many years of research and standardization. Single Source Multicast (SSM [13]) has been proposed as an alternative to the ASM (Any Source Multicast) model [6] that offers easier deployment, mainly by shifting part of the multicast functionality to higher layers and thus simplifying the service. We argue that the SSM service alone, as proposed in [13], does not provide the mechanisms necessary to facilitate the organic growth of multicast as a universally available service. In our previous work on Self-configuring Lightweight Internet Multicast (SLIM) we have addressed these issues [9]. Later in this paper, we elaborate on the various autonomic characteristics of SLIM. For a more detailed comparison of the aforementioned protocols please refer to [9].

Numerous tools have been developed to monitor the operation of the MBONE [16, 5, 1]. However, none of these approaches incorporates the internal nodes of the multicast distribution tree(s), and hence none is appropriate for aiding the context-aware automatic adaptation of the distribution service. In contrast, gTrace [11] efficiently and accurately maintains key attributes, such as group size and loss rate, throughout the multicast distribution tree.

Enhancing the elementary IP multicast service with higher-layer services, such as reliability and class-based forwarding, implemented in the core of the network has been a popular application among active networking researchers. With respect to reliability, [4] specifies a reliable multicast protocol over the Active Node Transport System. A similar approach is taken in [15] to show that network-based processing and storage can be used to enhance the performance and scalability of reliable multicast. Although our system potentially employs per-hop retransmissions at trouble spots, it does not attempt to provide reliable delivery; it is focused purely on enhancing service delivery by reducing variance across the set of receivers and throughout the duration of the session. Moreover, our system does not require or assume programmability, smart packets or other esoteric forwarding path changes.

Lately, the increasing management cost of network equipment and services has raised interest in injecting autonomic features, such as context awareness and distributed policy-based control, into next-generation networks. To facilitate autonomic communication, well-established parts of the network architecture may undergo serious reorganization, not excluding the network stack itself. In this context, we see our work as a contribution to this reorganization.

3 The Autonomic Multicast Architecture

The proposed router architecture is depicted in Figure 1. The system is built on top of the IP network, which provides network connectivity and basic forwarding services. The highest layer contains policy descriptions encoding application semantics. We consider high-level content descriptions that state objectives and policies without specifying how these goals are achieved. For example, a description would prescribe whether the content is interactive or not. At the next layer is the Autonomic Real-time Multicast Distribution System, consisting of three parts: a multicast topology management protocol (SLIM), measurement and monitoring (gTrace), and the Autonomic MUlticast SErvice (AMUSE) component. Based on the policy description and observations obtained from the monitoring subsystem, the AMUSE component coordinates with and/or autonomously manipulates the other two components to reduce variance in service quality and provide QoS consistent with the policy description.

Figure 1. Router Architecture Block Diagram

In what follows we discuss how each individual component, and our system as a whole, is self-configuring, self-optimizing, self-healing, and supports incremental deployment.

3.1 Multicast topology management

The most basic component of our system is the multicast topology management module. Our system is agnostic to the particulars of the multicast protocol beyond the autonomic characteristics that we detail below. For the purpose of this work we use the SLIM protocol as defined in [9]. The protocol self-configures by dynamically constructing tunnels over non-cooperating routers and by coping with firewalls and NATs [10]. Control messages, JOIN and LEAVE, are sent towards the single source and processed by the SLIM-enabled routers. All SLIM state is soft with periodic refresh signals, thus the protocol self-heals in case of network failures. In addition to the above properties, the SLIM protocol has some noteworthy autonomic characteristics:

Reducing forwarding state on routers: In network-layer multicast, typically every node on the path from the receivers to the root possesses forwarding state in its datapath. However, multicast topologies tend to be sparsely connected and therefore only a few nodes act as actual branch points. In our SLIM implementation we have implemented the policy that only routers at branch points maintain multicast forwarding state; datagrams are tunneled between branch points using the protocol's tunneling capabilities, resulting in significant forwarding state savings.

Load balancing by bounding fan-out: The multicast topology constructed by the SLIM protocol has significant flexibility and may be adjusted to limit and/or balance the replication work at the nodes of the topology, as detailed in [9].

Evolving the topology towards minimum cost spanning trees: SLIM builds distribution trees on the shortest path from receivers to source, using the unicast routing information base. While this is adequate for short-lived sessions, we are currently evaluating the potential of gradually migrating the distribution tree towards a minimum cost distribution tree to optimize data distribution for long-lived sessions.

Adapting refresh rates: SLIM employs unreliable control messages to refresh soft state, using exponentially increasing time intervals between consecutive control packet transmissions to reduce control overhead. Optionally, the exponential backoff can be dynamically adjusted to enhance robustness.

These properties are discussed more thoroughly in [9, 10].

3.2 Measurement and Monitoring

The second component of the autonomic multicast architecture is the measurement and monitoring component. We have defined a set of basic monitoring primitives and implemented them with an associated protocol for requesting and exchanging information throughout the distribution tree [11]. Using local observation at the nodes of the tree (or the subset participating in the measurement protocol) combined with a simple request/collect mechanism, all nodes of a given topology efficiently learn key session-specific properties including topology information, packet count, group size, and diameter. The measurement and monitoring subsystem/protocol self-configures over the distribution tree based on a request from an upstream node (the collector) and supports incremental deployment, as it does not assume that all nodes participate in the protocol. In prior work we have shown this protocol to be efficient and to provide timely and accurate observations. Furthermore, we have shown the protocol to be robust, rapidly self-correcting its estimates of the information maintained [11].
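
As an illustration of the request/collect mechanism of the measurement and monitoring component (Section 3.2), the following sketch merges per-node observations bottom-up over a distribution tree so that an upstream collector learns the group size, the depth of its subtree and the lowest packet count seen by any receiver. The Node and Report classes, their fields and the collect function are hypothetical names chosen for this example; the actual gTrace primitives and message formats are those defined in [11].

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A router or receiver in the distribution tree (hypothetical model)."""
    name: str
    packets_received: int = 0      # local packet count for the session
    is_receiver: bool = False
    children: List["Node"] = field(default_factory=list)


@dataclass
class Report:
    group_size: int    # receivers found in the subtree
    depth: int         # longest hop distance down to a leaf
    min_packets: int   # lowest packet count at any leaf of the subtree


def collect(node: Node) -> Report:
    """Answer a collector's request by merging the reports of all children."""
    own = 1 if node.is_receiver else 0
    if not node.children:
        return Report(own, 0, node.packets_received)
    reports = [collect(child) for child in node.children]
    return Report(
        group_size=own + sum(r.group_size for r in reports),
        depth=1 + max(r.depth for r in reports),
        min_packets=min(r.min_packets for r in reports),
    )


# Example tree: r1 -> {rcvr1, r2 -> {rcvr2, rcvr3}}
r2 = Node("r2", children=[Node("rcvr2", 985, True), Node("rcvr3", 965, True)])
root = Node("r1", 1000, children=[Node("rcvr1", 990, True), r2])
report = collect(root)
print(report)
```

Each node only combines what its children report, which mirrors the idea that nodes rely on local observation plus a simple exchange along the tree rather than on a global view.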

3.3 Autonomic Adaptation

The autonomic element of the overall system is completed by the third component, the Autonomic MUlticast SErvice (AMUSE) component, which is responsible for adjusting the forwarding service as needed, based on policy and service descriptions on the one hand and observations obtained from the monitoring subsystem on the other. The policy and service descriptions distributed by the service provider define the service objectives. Three specific adaptation schemes are enabled by our architecture:

Localized Loss Recovery: By comparing the local receive rate to the receive rate on a given downstream branch, a node can determine if the losses on the downstream link are "excessive". The AMUSE agent can exploit this and activate local retransmissions on that particular link. Knowledge of the number of leaves in a subtree (as collected by gTrace) can further be used to "assess the damage", incorporating into the activation decision the number of receivers affected by a congestion incident, potentially using different activation thresholds as the number of affected receivers increases.

Reducing Delivery Delay: Apart from low loss rates, interactive applications require stringent limits on delivery delay and delay variation. Currently, real-time content distribution suffers from the Internet's best-effort delivery service. Our system design can alleviate this problem by dynamically instantiating a service on bottleneck links that increases the queuing priority for specific flows.

Fast Restoration: As connectivity in SLIM is maintained by periodic refresh of the soft state, connectivity across the distribution tree is eventually restored after route or link failures. However, due to exponential backoff, the refresh interval may become too long to provide sufficiently fast restoration. In prior work we have experimented with a scheme for IP-layer restoration in optical networks, resulting in restoration within a millisecond [8].

In [3] we elaborate on the components of our system and give a detailed service example.
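
The activation decision for Localized Loss Recovery can be sketched as follows. This is a minimal illustration and not the AMUSE implementation: the function names are hypothetical, the 1.5% base threshold is borrowed from the evaluation in Section 4, and the rule that tightens the threshold as more receivers are affected is an assumed policy chosen only to show how the subtree size could enter the decision.

```python
def downstream_loss_rate(local_received: int, downstream_received: int) -> float:
    """Fraction of locally received packets that did not arrive downstream."""
    if local_received <= 0:
        return 0.0
    lost = max(local_received - downstream_received, 0)
    return lost / local_received


def activation_threshold(base: float, affected_receivers: int) -> float:
    """Assumed policy: tolerate less loss when more receivers are affected."""
    if affected_receivers >= 100:
        return base / 4
    if affected_receivers >= 10:
        return base / 2
    return base


def should_retransmit(local_received: int,
                      downstream_received: int,
                      affected_receivers: int,
                      base_threshold: float = 0.015) -> bool:
    """Decide whether to activate link-local retransmission on one branch."""
    if affected_receivers == 0:
        return False  # nobody downstream benefits from recovery
    loss = downstream_loss_rate(local_received, downstream_received)
    return loss > activation_threshold(base_threshold, affected_receivers)


# Counts observed over one monitoring period on a single downstream branch.
print(should_retransmit(1000, 990, 1))    # 1% loss, one receiver
print(should_retransmit(1000, 960, 1))    # 4% loss, one receiver
print(should_retransmit(1000, 990, 20))   # 1% loss, twenty receivers
```

The decision uses only counters available at the node and the subtree size reported by the monitoring component, in line with the goal of reacting on local information.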

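To see why exponential backoff of the soft-state refresh can conflict with fast restoration, the sketch below generates a capped, exponentially growing refresh schedule and reports which interval is in effect a given time after the state was created. The initial interval, growth factor and cap are hypothetical values chosen for illustration; SLIM's actual timer parameters are described in [9, 10].

```python
from itertools import islice
from typing import Iterator


def refresh_intervals(initial: float = 1.0,
                      factor: float = 2.0,
                      cap: float = 60.0) -> Iterator[float]:
    """Yield successive refresh intervals in seconds (assumed parameters)."""
    interval = initial
    while True:
        yield interval
        interval = min(interval * factor, cap)


def interval_in_effect(elapsed: float,
                       initial: float = 1.0,
                       factor: float = 2.0,
                       cap: float = 60.0) -> float:
    """Return the refresh interval in effect `elapsed` seconds after setup."""
    start = 0.0
    for interval in refresh_intervals(initial, factor, cap):
        if start + interval > elapsed:
            return interval
        start += interval


print(list(islice(refresh_intervals(), 8)))
print(interval_in_effect(10.0))
```

In a long-lived session the interval in effect sits at the cap, so a failure that occurs shortly after a refresh may remain unrepaired for close to that long if restoration relies on the refresh alone. This is the gap that a dedicated fast restoration scheme is meant to close.
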
4 Evaluation

To illustrate the benefit and potential of our autonomic system, in this section we evaluate, with preliminary simulations, the overall benefit across a set of receivers participating in a real-time multicast session. Specifically, we show how the system autonomously reacts to consecutive congestion incidents and activates link-local retransmissions as a countermeasure to reduce variation in service quality.

4.1 Simulation Scenario

We implemented an ns-based packet-level simulator for our evaluation [3]. Our experiment models a single-source streaming session of 70 seconds with three classes of receivers.

Autonomic Content Distribution: Our simulator fully implements the specification of the SLIM multicast distribution protocol and the gTrace monitoring service as presented in [9] and [11] respectively. Finally, the AMUSE agent has been implemented and is instantiated on each router of our simulation scenario.

Router-level Topology: For the sake of demonstration we use a simple tree topology comprising four routers, three receivers and a single channel source that continuously transmits at a constant bit rate of 1 Mbps. Ignoring the processing overhead at routers, we use a uniform propagation delay of 10 ms for all physical links. While this is a rather long delay, it is still two orders of magnitude smaller than the monitoring report update period we consider (1 second). The rationale behind large propagation delays is to stress the reaction time, and thus the efficiency, of the retransmission service on long-haul links.

Figure 2. Simulation Topology

Loss Model: We model the packet loss experienced by data packets departing from the source using a per-path two-state scheme. In the healthy state, a link has an average loss rate of approximately 1%, and in the congested state the loss rate is approximately 4%; in both states packets are dropped according to Bernoulli trials (independent losses). The duration of each state is exponentially distributed, with a mean of 5 seconds for the healthy state and 2.5 seconds for the congested state, except for the link between router 1 and router 2, which is always in the healthy state. The net effect of our loss model is a session with three classes of receivers, each class perceiving a different fluctuation of service quality in terms of loss rate. More precisely, receiver 1 receives a steady stream with a 1% average loss rate, receiver 2 experiences a loss rate varying from 1% to about 2.5% with an average of 1.5%, and receiver 3 receives a stream with a loss rate from 1% to 4.35% with an average of 3.5%.
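
The two-state loss model of this section can be sketched for a single path in isolation as shown below. This is not the ns-based simulator used for the evaluation: the function name, the packet rate of 100 packets per second and the seed are assumptions made for illustration, and the sketch does not attempt to reproduce the per-receiver loss figures, which depend on the simulated topology.

```python
import random


def simulate_path(duration_s: float = 70.0,
                  rate_pps: int = 100,          # assumed packet rate
                  p_healthy: float = 0.01,
                  p_congested: float = 0.04,
                  mean_healthy_s: float = 5.0,
                  mean_congested_s: float = 2.5,
                  seed: int = 1) -> float:
    """Return the fraction of packets lost on one two-state path."""
    rng = random.Random(seed)
    congested = False
    # expovariate takes a rate, i.e. the reciprocal of the mean duration.
    state_end = rng.expovariate(1.0 / mean_healthy_s)
    sent = lost = 0
    step = 1.0 / rate_pps
    t = 0.0
    while t < duration_s:
        # Switch state whenever the current state's duration has expired.
        while t >= state_end:
            congested = not congested
            mean = mean_congested_s if congested else mean_healthy_s
            state_end += rng.expovariate(1.0 / mean)
        # Independent (Bernoulli) drop decision for this packet.
        p = p_congested if congested else p_healthy
        sent += 1
        if rng.random() < p:
            lost += 1
        t += step
    return lost / sent


loss = simulate_path()
print(f"loss rate over the session: {loss:.4f}")
```

Running the function with different seeds gives a feel for how much the session-wide loss rate of a single path can vary between replications.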

4.2 Results

We simulate two runs of the session described above: one without any retransmission scheme and a second using our autonomic multicast distribution architecture. In the second scenario, we set the retransmission activation threshold to a 1.5% packet loss rate, the latter measured over a gTrace update period (1 second), and keep retransmission on, once activated, for the entire duration of the experiment. For each run we execute enough replications for the standard deviation of the packet loss rate at each receiver to fall below 0.2%.

Figure 3 compares, for each run, the loss rate statistics perceived by each of the three receivers. Using a primitive content distribution without any healing features leads to a high variation in the service quality received by each group member. This is clearly reflected in the mean and maximum loss rates experienced by each receiver class, where service degradation almost doubles as we progress from Rcvr1 to Rcvr3. In contrast, arming the distribution with autonomic features, and particularly self-healing functionality through active retransmissions, provides an acceptably low loss rate across all receiver classes. When the threshold of the policy description is reached at receiver classes 2 and 3, retransmission is activated, resulting in service quality within the policy. However, no retransmission is activated for receiver class 1, as seen in the figure, since the measured loss rate does not exceed the loss threshold of the policy description.

Figure 3. Per receiver loss probability with and without retransmission.

Apart from the mean and maximum loss rate over the entire session, keeping packet loss smooth and low over small timescales is necessary to make the retransmission service appropriate for real-time applications. As shown in [3], when congestion occurs the retransmission service both keeps loss down to negligible rates and shortens the duration of the congestion incident.

5 Conclusion

In this work we have described a new autonomic multicast architecture. Combining application-layer service objectives with local observations and adaptation results in an autonomic network service optimized across the layer boundaries to provide consistent service quality, matching the goals of the application-level service. We have validated the system with preliminary simulations. While simple, they show the benefits achievable in terms of reduced variation in service quality. We have implemented the SLIM and gTrace protocols on the Pronto architecture [12]. Our current activities include efforts to implement the AMUSE agent on the platform, to experiment with the system with a wider range of applications, and to design and experiment with new adaptation schemes. Furthermore, we plan to extend the simulation model to more realistic scenarios. Our research shows the viability and substantial potential of our autonomic system to provide quality service over the Internet and allow for organic growth of multicast while minimizing the cost and complexity of managing the service.

References

[1] S. Alouf, E. Altman, and P. Nain. Optimal on-line estimation of the size of a dynamic multicast group. In IEEE INFOCOM, 2002.
[2] R. Braden, D. Clark, and S. Shenker. RFC 1633: Integrated services in the Internet architecture: an overview. IETF, 1994.
[3] B. Brynjúlfsson et al. Autonomic network-layer multicast service - towards consistent service quality. Technical Report RUTR-CS06001, Jan 2006.
[4] M. Calderón, M. Sedano, A. Azcorra, and C. Alonso. Active network support for multicast applications. In IEEE INFOCOM, 1998.
[5] Y. Chen, D. Bindel, and R. H. Katz. Tomography-based overlay network monitoring. In ACM SIGCOMM Conference on Internet Measurement, 2003.
[6] S. Deering, D. L. Estrin, D. Farinacci, V. Jacobson, C.-G. Liu, and L. Wei. The PIM architecture for wide-area multicast routing. IEEE/ACM Trans. Netw., 4(2):153–162, 1996.
[7] S. B. et al. RFC 2475: An architecture for differentiated services. Internet Engineering Task Force (IETF), 1998.
[8] A. Greenberg, G. Hjálmtýsson, and J. Yates. Smart routers – simple optics: A network architecture for IP over WDM, 2000.
[9] G. Hjálmtýsson, B. Brynjúlfsson, and Ólafur R. Helgason. Self-configuring Lightweight Internet Multicast. In IEEE SMC 2004, Hague, Netherlands, 2004.
[10] G. Hjálmtýsson, B. Brynjúlfsson, and Ólafur Ragnar Helgason. Overcoming last-hop/first-hop problems in IP multicast. In Networked Group Communication (NGC), volume 2816, pages 205–213, January 2003.
[11] G. Hjálmtýsson, Ólafur Ragnar Helgason, and B. Brynjúlfsson. gTrace: Simple mechanisms for monitoring of multicast sessions. In NETWORKING, 2005.
[12] G. Hjálmtýsson, H. Sverrisson, B. Brynjúlfsson, and Ó. R. Helgason. Dynamic packet processors - a new abstraction for router extensibility. In OPENARCH, April 2003.
[13] H. W. Holbrook and D. R. Cheriton. IP multicast channels: Express support for large-scale single-source applications. SIGCOMM Computer Communications Review, 29(4):65–78, 1999.
[14] J. Kephart and D. M. Chess. The vision of autonomic computing. IEEE Computer, 36(1):41–50, 2003.
[15] L.-W. H. Lehman, S. J. Garland, and D. L. Tennenhouse. Active reliable multicast. In INFOCOM (2), pages 581–589, 1998.
[16] K. Sarac and K. Almeroth. Supporting multicast deployment efforts: A survey of tools for multicast monitoring, 2001.
