AtriOSS: a management system for QoS enabled networks

Pablo Arozarena Llopis, Javier González Ordás
Telefónica I+D, Madrid, Spain
{pabloa, javiord}@tid.es

Keywords: Network management; QoS; SLA; DiffServ; Traffic engineering; MPLS

Abstract

The increasing complexity of IP networks, together with the new requirements imposed on them, creates a demand for new tools and methods for their management and operation. AtriOSS proposes a component-based architecture for managing the sophisticated QoS mechanisms that are required to guarantee service quality on certain network configurations, for verifying that the quality actually delivered meets the signed SLAs, and for supporting the traffic engineering capabilities of routers.

1. Introduction

This paper describes AtriOSS, an IP network management system which is currently under development as part of the European IST project ATRIUM¹. ATRIUM (A Testbed of Terabit IP Routers running MPLS over DWDM) provides an advanced pan-European IP network for research on traffic engineering and QoS aspects of networking [1]. AtriOSS is conceived to manage the ATRIUM core network, with special emphasis on its QoS capabilities and on how to support stringent Service Level Agreements.

2. Rationale

During the last few years IP networks have faced a significant shift in the type of traffic they carry. The old model, based on best-effort traffic, flat billing and simple applications, is being replaced by the concept of the next generation network, capable of transporting a mixture of traffic over a converged infrastructure. This evolution in network capabilities needs to be accompanied by new ways of operating and managing the network, which in turn requires new tools with sophisticated functionality.

While certain areas of the TMN model are well covered by traditional management tools, network management support for state-of-the-art protocols and mechanisms, such as MPLS or DiffServ [2], is not keeping pace with the development and field deployment of the mechanisms themselves. The consequence is that the improvements in efficiency and flexibility that those features should bring to the network are not, in practice, as high as initially expected. Lack of end-to-end visibility, slow reaction to changes in traffic patterns and the impact of differences in vendor implementations are some of the causes.

AtriOSS mainly tackles quality of service (QoS) management. QoS mechanisms, and in particular those defined within the DiffServ framework, allow the network to be tuned so that it gives each flow the appropriate treatment, depending on network parameters such as the current load and on the particular characteristics of the flow. It must be noted that there are studies arguing that a relatively small amount of overprovisioning in the core eliminates the need for QoS mechanisms, even for voice traffic [3]. However, it is not clear that this will still hold true when more bandwidth-demanding traffic patterns become widespread in the network (in ATRIUM, massive file transfer protocols and a multi-videoconference application consuming up to 14 Mbit/s per user, 70 Mbit/s per conference, have been demonstrated). AtriOSS aims to simplify the management of QoS mechanisms in order to help make them a more cost-effective alternative to network overprovisioning.

Traffic engineering (TE) plays a secondary role in AtriOSS. The goal of TE is to achieve a more efficient usage of network resources. Modern IP routers implement algorithms and protocols, both for routing [4] and for signalling [5], to support distributed MPLS traffic engineering approaches. These approaches often assume the existence of a set of LSPs in the network, and try to optimise the route of the LSP [6], the assignment of traffic to the LSP [7], or the route of the established LSPs based on their known bandwidth needs [8]. But they all still need an external agent to request the set-up of the LSP with a set of parameters (bandwidth, priority, …). AtriOSS will play the role of such an agent.

Finally, AtriOSS will serve as the often missing glue between raw data from network measurements (see [9] for the corresponding IETF framework) and SLA-relevant metrics [10].
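
To make the external-agent role described above more concrete, the following Python sketch models an LSP set-up request carrying the parameters named in the text (bandwidth and priority). It is only an illustration under stated assumptions: the class, field and function names are hypothetical and are not taken from AtriOSS. The 0 to 7 priority range, with 0 as the highest priority, and the rule that the set-up priority should not be higher than the holding priority follow RSVP-TE [5].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LspRequest:
    """Hypothetical LSP set-up request issued by a management agent."""
    ingress: str            # name of the head-end router (illustrative)
    egress: str             # name of the tail-end router (illustrative)
    bandwidth_mbps: float   # bandwidth to reserve for the LSP
    setup_priority: int = 7  # 0 = highest, 7 = lowest (RSVP-TE convention)
    hold_priority: int = 7   # 0 = highest, 7 = lowest (RSVP-TE convention)

    def __post_init__(self):
        if self.bandwidth_mbps <= 0:
            raise ValueError("bandwidth must be positive")
        for priority in (self.setup_priority, self.hold_priority):
            if not 0 <= priority <= 7:
                raise ValueError("priority must be in the range 0..7")
        # A numerically lower value means a higher priority, so the set-up
        # priority value must not be smaller than the holding priority value.
        if self.setup_priority < self.hold_priority:
            raise ValueError("set-up priority must not be higher than holding priority")


def can_preempt(new: LspRequest, established: LspRequest) -> bool:
    """Return True if `new` may preempt `established`.

    Preemption is possible only when the set-up priority of the new LSP is
    strictly higher (numerically lower) than the holding priority of the
    established one.
    """
    return new.setup_priority < established.hold_priority
```

For instance, a request for a 70 Mbit/s videoconference LSP with set-up and holding priority 2 could preempt a default-priority bulk-transfer LSP on the same path, but not the other way round.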

¹ This paper describes work in progress in the context of ATRIUM, IST-1999-20675, a research and development project under the IST programme. The IST programme is partially funded by the Commission of the European Union.

3. AtriOSS functionality

Six main functional blocks have been identified:

Network performance management. This block has a twofold goal. On the one hand, it aims to provide the proper tools to identify which quality-related problems are affecting the network (essentially those caused by congestion), where they are located and what their impact is on the carried traffic. On the other hand, AtriOSS itself could use this information to reconfigure the network quickly and proactively, so that customers do not perceive the problem and SLAs are not compromised. Two sources of information will feed the AtriOSS network performance databases: local measurements taken by routers, and end-to-end active measurements collected by distributed monitoring stations. The monitoring stations are arranged in a mesh and can measure packet loss, one-way delay, one-way delay variation and round-trip time.

Service quality management. Post-processing of raw measurements will make it possible to estimate the actual quality delivered to customers. This will be compared against the parameters defined in the SLA; the contents of each signed SLA will be represented by AtriOSS in the form of a structured, formal Service Level Specification (SLS). An SLS instance will contain the quality parameters under agreement, the objective values for each parameter, penalty clauses if applicable, and any other information that may affect the service (for instance, how planned outages are to be treated).

Traffic engineering. This block will play a complementary role to the constraint-based routing capabilities incorporated in the routers. Using the taxonomy provided in [11], AtriOSS can be described as a state-dependent, online, centralised, global, prescriptive, closed-loop, tactical TE system. An additional essential characteristic is that AtriOSS will use MPLS to carry out traffic engineering actions. It must be noted that AtriOSS is only meant to be a part of a broader traffic engineering system. This broader system must include at least the traffic engineering functionality supported by the routers, on which AtriOSS will rely.

Network inventory. AtriOSS will provide a full view of the network under management. It will keep an inventory of both the physical (nodes, racks, cards, etc.) and the logical (LSPs, tunnels, subnetworks, etc.) network entities. The accuracy and completeness of this information is crucial for the system, since all other functionality relies on it.

Service provisioning. Although AtriOSS does not aim to implement a full service management system, it must address some of the functions traditionally associated with this management layer in order to support its SLA management capabilities. AtriOSS will differentiate four phases in the provisioning process: service definition, service deployment, service subscription and service usage.

Fault management. AtriOSS will support the collection, synchronisation, filtering, logging and presentation of alarms arriving as SNMP notifications from network nodes. Alarms will be modelled using the X.733 format [12].
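
The chain from raw measurements to SLA verification described under network performance management and service quality management can be sketched as follows. This is a minimal Python illustration, not the AtriOSS design: all names are hypothetical, each SLS objective is assumed to be an upper bound, delay variation is simplified to the spread between the largest and smallest observed one-way delay, and a metric that cannot be computed (for example the delay when every probe is lost) is treated as a violation.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Probe:
    """One active measurement packet sent between two monitoring stations."""
    sent_s: float                 # send timestamp in seconds
    received_s: Optional[float]   # receive timestamp, or None if the packet was lost


def summarise(probes: List[Probe]) -> Dict[str, float]:
    """Post-process raw probes into aggregate quality metrics."""
    if not probes:
        raise ValueError("at least one probe is required")
    delays = [p.received_s - p.sent_s for p in probes if p.received_s is not None]
    metrics = {"packet_loss": 1.0 - len(delays) / len(probes)}
    if delays:
        metrics["one_way_delay_ms"] = mean(delays) * 1000.0
        # Simplifying assumption: variation = max delay - min delay.
        metrics["delay_variation_ms"] = (max(delays) - min(delays)) * 1000.0
    return metrics


@dataclass(frozen=True)
class Sls:
    """Hypothetical, heavily reduced Service Level Specification."""
    customer: str
    objectives: Dict[str, float]  # parameter name -> maximum allowed value


def violations(sls: Sls, metrics: Dict[str, float]) -> List[str]:
    """Return the sorted names of the SLS parameters that are not met."""
    return sorted(
        name
        for name, bound in sls.objectives.items()
        if name not in metrics or metrics[name] > bound
    )
```

A real SLS would also carry penalty clauses and outage handling rules, as the text notes; they are left out here to keep the comparison step visible.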

4. AtriOSS architecture

Figure 1 provides an overview of the internal architecture of AtriOSS. It exhibits a hybrid HTML-Java lightweight client and a distributed CORBA-based server. This server comprises up to five application components (for fault, configuration and performance management functions), a number of supporting blocks and an independent network connector that hides device-specific protocols and information models (in this case, those of Alcatel's 7770 RCP router and of the monitoring stations).
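
The idea of a network connector that hides device-specific information models can be illustrated with a small Python sketch. It is an assumption-laden example rather than the AtriOSS implementation (whose server is CORBA-based): the connector interface, the example device and the layout of its notifications are all hypothetical. Only the perceived severity values are taken from X.733 [12].

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from enum import Enum
from typing import Dict


class Severity(Enum):
    """Perceived severity values defined by ITU-T X.733."""
    CLEARED = "cleared"
    INDETERMINATE = "indeterminate"
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"
    WARNING = "warning"


@dataclass(frozen=True)
class Alarm:
    """Device-independent alarm as seen by the application components."""
    managed_object: str
    probable_cause: str
    severity: Severity


class NetworkConnector(ABC):
    """Hypothetical interface that every device-specific connector implements."""

    @abstractmethod
    def to_alarm(self, notification: Dict[str, str]) -> Alarm:
        """Translate a raw device notification into a common Alarm."""


class ExampleRouterConnector(NetworkConnector):
    """Connector for an imaginary router that encodes severity as a digit."""

    _SEVERITY = {"1": Severity.CRITICAL, "2": Severity.MAJOR, "3": Severity.MINOR}

    def to_alarm(self, notification: Dict[str, str]) -> Alarm:
        return Alarm(
            managed_object=notification["node"] + "/" + notification["port"],
            probable_cause=notification["cause"],
            # Unknown vendor levels fall back to INDETERMINATE.
            severity=self._SEVERITY.get(notification["level"], Severity.INDETERMINATE),
        )
```

With this split, the fault management component only ever handles `Alarm` objects, and supporting a new device type means adding one more connector class.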

Figure 1. AtriOSS architecture overview

5. Future work

The system is currently under development. The requirements definition, system specification and architecture design phases have been completed. Basic inventory and fault management functionality became available early by adapting existing components. Given the current state of development, no conclusions have been drawn yet. Extensive support for QoS management will be added in the coming months, and a testing phase will follow. The project will end in November 2003.

References

[1] http://www.alcatel.be/atrium. Last access: April 2003
[2] Blake et al., "An Architecture for Differentiated Services", RFC 2475, IETF, December 1998
[3] Fraleigh et al., "Provisioning IP Backbone Networks to Support Latency Sensitive Traffic", March 2003
[4] Katz et al., "Traffic Engineering Extensions to OSPF Version 2", IETF draft, work in progress, October 2002
[5] Awduche et al., "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, IETF, December 2001
[6] Xiao et al., "Traffic engineering with MPLS in the Internet", IEEE Network magazine, March 2000
[7] Elwalid et al., "MATE: MPLS Adaptive Traffic Engineering", 2001
[8] Blanchy et al., "Routing in a MPLS network featuring preemption mechanisms", February 2003
[9] Paxson et al., "Framework for IP Performance Metrics", RFC 2330, IETF, May 1998
[10] Cselenyi et al., "Inter-operator interfaces for ensuring end to end IP QoS", Eurescom P1008, D3, May 2001
[11] Awduche et al., "Overview and Principles of Internet Traffic Engineering", RFC 3272, IETF, May 2002
[12] "System Management: Alarm Reporting Function", ITU-T Recommendation X.733, 1992
