CMon: Multi-Domain Circuit Monitoring System Based on GÉANT perfSONAR MDM

Hao Yu
Technical University of Denmark (DTU), Oersteds Plads 343, Kgs. Lyngby, 2800, Denmark
E-mail: [email protected] Phone: +45-4525-3635

Trupti Kulkarni
DANTE, 126-130 City House, Hills Road, Cambridge, CB2 1PQ, UK
E-mail: [email protected] Phone: +44-1223-371350

Feng Liu
LRZ, Boltzmannstrasse 1, 85748 Garching b. München, Germany
E-mail: [email protected]

Paper type Research paper.

Abstract Global research collaborations today require robust, secure and dedicated network connections to facilitate data communication between collaborating partners. To manage and process the large amounts of data involved, dedicated connections are required to transport data in a highly efficient manner. Managing such links, which often traverse multiple geographically dispersed domains with heterogeneous network infrastructure, poses many compelling research challenges, one of which is inter-domain network monitoring. In this paper, a multi-domain circuit monitoring system, CMon, is introduced to address this topic. Building upon services provided by GÉANT perfSONAR MDM, CMon aims to provide end-to-end circuit monitoring services with flexibility, extensibility, and vendor independence, regardless of the underlying circuit provisioning systems, while keeping in mind future enhancements of the system itself. By using measurement federations, the architecture of CMon can adapt either to changes in the circuit provisioning system or to the expansion of the network with little or no change to the CMon system as a whole. The tool can be seen as a valuable addition to the network monitoring suite, bringing circuit monitoring into the fold. Further research possibilities include a more distributed architecture for scalability, incorporating passive measurements by way of robust protocols and/or network management tools, and expanding the monitoring ability to multi-domain VPNs, multicast networks, etc.

Keywords Circuits (static, dynamic); multi-domain; network monitoring; performance measurement

1. Introduction

Global research collaborations today require robust, secure and dedicated network connections to facilitate data communication between collaborating partners. Layer 3 (L3) Internet Protocol (IP) connections usually lack the ability to provide the quality of service (QoS) needed to manage parameters such as bandwidth, jitter and packet loss. To manage and process large amounts of data with a QoS guarantee, dedicated connections (circuits) are required to transport data in a highly efficient manner. Systems that dynamically provision such circuits are referred to as circuit provisioning systems (CPSs). Different domains have different CPSs, e.g. Automated Bandwidth Allocation across Heterogeneous Networks (AutoBAHN) (Lukasik, J., 2011) and On-demand Secure Circuits and Advance Reservation System (OSCARS) (Guok, C., 2006). Using these tools, circuits with a set of requirements (bandwidth, maximum transmission unit (MTU), delay, ingress and egress virtual local area network (VLAN) identifiers, duration, etc.) can be set up on demand, thus providing the Bandwidth-on-Demand (BoD) service. In addition to these dynamically created BoD circuits, statically provisioned circuits (static circuits) also exist across multiple domains to provide long-term data services. The performance of both types of circuits must be monitored in order to provide a complete data service.

However, it is challenging to monitor the performance of dynamically provisioned circuits because of the various technologies used in different domains, collaboration challenges, policy issues, etc. To tackle these problems of multi-domain circuit monitoring, the multi-domain Circuit Monitoring system (CMon) is built using services provided by perfSONAR to achieve flexibility, extensibility, compatibility and vendor independence. This paper presents the architecture and design of CMon. Section 2 describes the background, Section 3 presents the architecture of CMon and Section 4 discusses the implementation and deployment of CMon. Finally, Section 5 concludes the paper.

2. Background

Two components, the Bandwidth-on-Demand (BoD) service and perfSONAR, are key to CMon's conception and therefore to its design and architecture. The CMon system looks to a CPS to retrieve circuit and domain information and, using that information, interfaces with perfSONAR modules to monitor the circuits. These two components are briefly explained below.

The BoD service is an end-to-end, point-to-point connectivity service that is used for data transport. The motivation behind creating such a service was to allow domains that are geographically dispersed, and that differ in the technology used in their networks, to create circuits dynamically and on demand with a specified bandwidth and duration. Such circuits are more commonly known as dynamic circuits. As one of the CPSs, AutoBAHN was launched as a pilot in the GÉANT2 project (2005-2009). Please see (Lukasik, J., 2011) for more information.

perfSONAR is a framework for multi-domain network performance monitoring. The web services-based infrastructure was developed in collaboration between Internet2, ESnet and the GÉANT2/GÉANT3 projects with the objective of providing multi-domain performance information. Please see (Yampolsky, M., 2007) for more information.

As mentioned above, CMon uses perfSONAR services to gather information in order to implement a vendor-independent solution for multi-domain circuit monitoring. Some modules of CMon also interact directly with the underlying CPS. Part of CMon also interacts directly with network devices, for example in domains where perfSONAR is not available, in order to collect monitoring data.

3. Circuit Monitoring

3.1 Circuit Description

Describing a circuit is of great importance in multi-domain environments because the description determines whether a circuit can be identified uniquely. A multi-domain circuit consists of several segments, each of which is administered by one domain. Thus a unique circuit identifier (CID) and several segment identifiers (SIDs) are required to fully define a circuit. To correctly describe the topology, a circuit descriptor and several segment descriptors are needed. Together, these four elements fully and uniquely describe a multi-domain circuit. Details are presented below:

• Circuit Identifier (CID): The CID uniquely defines a single multi-domain circuit. The format of this identifier is a uniform resource name (URN) in the Global Lambda Integrated Facility (GLIF) naming format, containing a domain name part and a domain-specific part:
  GLOBAL-CID = urn:ogf:network::  = @
• Circuit Descriptor: The circuit descriptor is an Extensible Markup Language (XML) file containing the high-level topology and the metadata (CID, bandwidth, duration, source, destination, etc.) of a circuit, defined in NML (Network Markup Language).
• Segment Identifier (SID): The SID uniquely identifies a segment of a circuit. The SID also has the GLIF URN format:
  GLOBAL-SID = urn:ogf:network::  = @:
• Segment Descriptor: The segment descriptor, also defined in NML, contains a list of the network elements involved in the reservation within a domain. It contains detailed information on the topology of a segment, from device level down to port level. It is up to each domain to decide, according to its own policy, how much information about the segment is revealed.
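The relationship between the four elements above can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the `domain` field and the example identifier strings are hypothetical, identifiers are treated as opaque strings rather than parsed URNs, and the descriptors are stored as unparsed XML text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Segment:
    """One segment of a circuit, administered by a single domain."""
    sid: str             # segment identifier, kept as an opaque string
    domain: str          # administering domain (hypothetical field)
    descriptor_xml: str  # NML segment descriptor, stored unparsed


@dataclass
class Circuit:
    """A multi-domain circuit: one CID, one descriptor, several segments."""
    cid: str             # circuit identifier, kept as an opaque string
    descriptor_xml: str  # NML circuit descriptor, stored unparsed
    segments: Dict[str, Segment] = field(default_factory=dict)

    def add_segment(self, segment: Segment) -> None:
        # A SID must be unique within the circuit.
        if segment.sid in self.segments:
            raise ValueError(f"duplicate SID: {segment.sid}")
        self.segments[segment.sid] = segment

    def domains(self) -> List[str]:
        # Domains in the order their segments were added.
        return [s.domain for s in self.segments.values()]


# Placeholder identifiers; these do not follow the real GLIF URN syntax.
circuit = Circuit(cid="example-cid-1", descriptor_xml="<circuit/>")
circuit.add_segment(Segment("example-sid-a", "a.example", "<segment/>"))
circuit.add_segment(Segment("example-sid-b", "b.example", "<segment/>"))
```

Keeping the descriptors unparsed reflects the point that each domain decides how much segment detail it reveals; the model only relies on the identifiers.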

3.2 Architecture

BoD circuits are dynamic because they are established and released on demand. The resources used may change each time, even if the two end points remain the same, so the measurement probes may not always be the same either. This requires CMon to be flexible and robust enough to handle such changes.

CMon is designed to utilize the services of perfSONAR because they give CMon flexibility, extensibility and vendor independence. For a multi-domain circuit monitoring tool, where the underlying provisioning techniques can vary greatly, the less interaction there is with the CPS, the better. Otherwise, development and deployment would be slowed down and CMon would have too many factors to consider, taking it away from its core functionality. With these points in mind, the architecture of CMon is shown in Figure 1. It consists of three groups of functions:

• Circuit Provisioning Functions: One element of a circuit provisioning system (e.g. OSCARS, AutoBAHN) is crucial from the circuit monitoring perspective. The inter-domain provider agent (IPA) is a technology-agnostic function that interfaces with other IPAs in different domains to establish which domains participate in the end-to-end circuit.
• Network Monitoring Functions: perfSONAR has several functions that are used by CMon, i.e. the measurement archive (MA) and the lookup service (LS); please refer to (Zurawski, J., 2011) for more information.
• Circuit Monitoring Functions: For circuit monitoring, additional functions are required. These are twofold: first, to interface the monitoring service with the underlying data plane or monitoring system in order to collect monitoring information and measurement data, and second, to interface with CPSs in order to collect circuit reservation and termination information.

CMon Headquarters (HQ) is the function responsible for gluing together the CPS and the network monitoring system, and for gathering the measurement data of circuits. HQ collects and stores measurement data from either CMon Agents (AGTs) or MAs, and also fetches circuit reservation and termination information. HQ registers itself in the global lookup service (gLS), both so that it can be found by other perfSONAR services and so that it can search for AGTs. The AGT also registers itself in the gLS, gathers monitoring data from either an MA or an already existing monitoring proxy, and periodically sends the data to HQ in a formatted XML file. SNMP traps from the data plane are also handled directly by the AGT. Trap handling and SNMP polling are performed by the local AGT in the domain, which keeps the interface with the data plane secure and contained within that domain.

Figure 1 CMon Architecture.

CMon HQ collects and aggregates the monitoring data that domains provide to the Agents installed on their infrastructure, either via an XML file or via a third-party monitoring proxy. SNMP traps and polling can also be used to gather and transfer such information securely. The data gathered allows both the status and the measurements of a circuit to be reported. To this end, CMon collects the interface status, the errors and discards at each interface, and the number of octets received and sent, and it also performs calculations to compute bandwidth (only where such information is not available via perfSONAR). This information is stored in the HQ, indexed so that it can be retrieved for a given end-to-end multi-domain circuit when a user queries it via the GUI.
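The bandwidth calculation mentioned above can be illustrated with a short sketch. It assumes that two octet-counter readings and the time between them are available, and that the counter wraps at most once between readings; the function name and the default counter width are assumptions for illustration, not details taken from CMon.

```python
def average_bps(octets_start: int, octets_end: int, seconds: float,
                counter_bits: int = 64) -> float:
    """Average throughput in bits per second between two counter readings.

    Assumes the octet counter wrapped at most once between the readings.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    modulus = 1 << counter_bits
    # Modular subtraction yields the right delta even after a single wrap.
    delta_octets = (octets_end - octets_start) % modulus
    return delta_octets * 8 / seconds


# 125000 octets in 10 seconds is 100 kbit/s.
rate = average_bps(1_000, 126_000, 10)

# A 32-bit counter that wrapped: 100 octets before the wrap, 400 after.
wrapped_rate = average_bps(2**32 - 100, 400, 4, counter_bits=32)
```

Dividing the result by the circuit's reserved bandwidth would give a utilization figure for the polling interval.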

3.3 Communication Protocol

The whole circuit monitoring process is divided into three phases: circuit notification, circuit monitoring, and circuit termination.

The circuit notification phase lasts from the time a circuit becomes active until CMon starts gathering monitoring data. As shown in Figure 2, each IPA sends HQ a circuit request message (CRM) with establishment information, including the circuit descriptor of the circuit and a segment descriptor of the segment in its domain. To acknowledge the reception of the CRM, HQ returns an OK message. HQ then looks up the addresses of all the related AGTs in the hLS. Once the addresses are obtained, HQ sends the begin command to the AGTs.

[Sequence diagram. Participants: IPA_1, IPA_2, IPA_3, hLS, HQ, AGT_1, AGT_2, AGT_3. Messages: CRM, OK, Look up AGT addr, cmd: begin.]

Figure 2 Circuit Notification Phase.

After the circuit notification phase, CMon enters the circuit monitoring phase, which lasts for the lifetime of the circuit. During this period, AGTs gather monitoring data from the MA, the monitoring proxies, and the data plane, depending on the situation. Periodic SNMP polling is used to fetch less time-critical data such as traffic volume and bandwidth utilization. SNMP traps are used to gather time-critical data such as link status. As soon as the status of an interface changes, from up to down or vice versa, the AGT receives an SNMP notification about the change. CMon can therefore react quickly to changes in link status.

[Sequence diagram. Participants: CMon GUI, IPA, Monitoring Proxy, AGT, HQ, Data plane, MA. Messages: cmd: begin, 200 OK, cmd: fetch, data, polling interval, SNMP trap: link failure, alert: failure, alarm.]

Figure 3 Circuit Monitoring Phase.

When a circuit expires or is manually terminated, CMon enters the circuit termination phase for this circuit. Each IPA sends a circuit termination message (CTM) to HQ to indicate the termination of the circuit. The CTM includes the same information as a CRM: a circuit descriptor and a segment descriptor, with a unique CID and SID. Upon receiving the CTMs, HQ returns OK messages as acknowledgement. HQ then commands all the related AGTs to stop gathering monitoring data by sending the end command.

[Sequence diagram. Participants: IPA_1, IPA_2, IPA_3, AGT_1, AGT_2, AGT_3, HQ. Messages: data, CTM, OK, cmd: end.]

Figure 4 Circuit Termination Phase.
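The three phases can be summarized in a minimal sketch of the HQ side of the protocol. It is not the CMon implementation: the class and method names are hypothetical, the lookup service and the transport to the AGTs are replaced by plain callables, and the caller decides when all CRMs or CTMs have arrived, since the text does not specify how HQ detects this.

```python
from enum import Enum
from typing import Callable, Dict, List


class Phase(Enum):
    NOTIFICATION = "notification"
    MONITORING = "monitoring"
    TERMINATED = "terminated"


class Headquarters:
    """Tracks the phase of each circuit and commands the related AGTs."""

    def __init__(self, lookup: Callable[[str], str],
                 send: Callable[[str, str, str], str]) -> None:
        self._lookup = lookup  # domain -> AGT address (stands in for the hLS)
        self._send = send      # (address, command, cid) -> reply
        self.phase: Dict[str, Phase] = {}
        self._domains: Dict[str, List[str]] = {}
        self._agents: Dict[str, List[str]] = {}

    def on_crm(self, cid: str, domain: str) -> str:
        """Record a circuit request message and acknowledge it."""
        self.phase.setdefault(cid, Phase.NOTIFICATION)
        self._domains.setdefault(cid, []).append(domain)
        return "OK"

    def begin(self, cid: str) -> None:
        """Look up the AGT addresses, then start monitoring."""
        self._agents[cid] = [self._lookup(d) for d in self._domains[cid]]
        self._command_all(cid, "begin")
        self.phase[cid] = Phase.MONITORING

    def on_ctm(self, cid: str, domain: str) -> str:
        """Acknowledge a circuit termination message for a known circuit."""
        if cid not in self.phase:
            raise KeyError(cid)
        return "OK"

    def end(self, cid: str) -> None:
        """Tell every related AGT to stop gathering data."""
        self._command_all(cid, "end")
        self.phase[cid] = Phase.TERMINATED

    def _command_all(self, cid: str, command: str) -> None:
        for address in self._agents[cid]:
            if self._send(address, command, cid) != "OK":
                raise RuntimeError(f"{address} rejected {command}")


# Usage with stand-in lookup and transport functions.
sent = []

def fake_lookup(domain: str) -> str:
    return f"agt.{domain}"

def fake_send(address: str, command: str, cid: str) -> str:
    sent.append((address, command))
    return "OK"

hq = Headquarters(fake_lookup, fake_send)
for d in ("a.example", "b.example"):
    hq.on_crm("example-cid-1", d)
hq.begin("example-cid-1")
for d in ("a.example", "b.example"):
    hq.on_ctm("example-cid-1", d)
hq.end("example-cid-1")
```

Injecting the lookup and send functions keeps the sketch independent of any particular RPC mechanism or lookup service client.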

4. Implementation and Deployment

The current CMon implementation is in Python, given its flexibility and support for fast prototyping. Remote Procedure Calls (RPCs) are used for the communication between AGTs and HQ. Monitoring data as well as meta-information regarding circuits are persisted in a NoSQL database, in which data entries are modeled using the JavaScript Object Notation (JSON) format. The open-source NoSQL database used is MongoDB, which facilitates fast and simple data access.
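To illustrate what a JSON-modeled data entry might look like, the sketch below builds one monitoring sample as a Python dictionary and round-trips it through JSON. The field names and values are hypothetical and are not taken from the CMon schema; only the standard library is used, so no database is required to run it.

```python
import json
from datetime import datetime, timezone


def make_sample(cid: str, sid: str, status: str,
                in_octets: int, out_octets: int, when: datetime) -> dict:
    """Build one monitoring sample as a JSON-compatible dictionary."""
    return {
        "cid": cid,                     # circuit the sample belongs to
        "sid": sid,                     # segment the sample was taken on
        "timestamp": when.isoformat(),  # ISO 8601 string, JSON has no date type
        "interface": {
            "status": status,
            "in_octets": in_octets,
            "out_octets": out_octets,
        },
    }


sample = make_sample("example-cid-1", "example-sid-a", "up", 1000, 2000,
                     datetime(2013, 10, 1, tzinfo=timezone.utc))
encoded = json.dumps(sample, sort_keys=True)
decoded = json.loads(encoded)
```

A document of this shape could be stored in a document database such as MongoDB and later retrieved by querying on the `cid` field.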

A version of CMon for static circuits was completed and demonstrated at the GN3plus Symposium in October 2013 in Vienna, Austria. The CMon HQ was located at LRZ, Germany, and three agent deployments were made at DANTE (for GÉANT), NORDUnet, and DFN purely for demonstration purposes. Monitoring details of static circuits were displayed on the GUI and monitoring data was collected via XML files. The version of CMon for dynamic circuits is under development in cooperation with the AutoBAHN development team and some NRENs for the pilot. To make CMon sensitive to interface failures, SNMP traps are used to respond quickly to any change in interface status. For less time-critical data, such as traffic volume and bandwidth utilization, SNMP polling is used.

Apart from static circuits, a definite use case for CMon is AutoBAHN, the tool used to provide the BoD (Bandwidth on Demand) service in GÉANT. This tool is used in a number of projects, such as Exemplar, the Irish next-generation network. It is also used by a number of NRENs that provide services of this nature to end users, who use such dynamically created circuits for research purposes and data exchange. It is highly important that such links are monitored in order to provide the highest possible level of service to both the participating NRENs and their users. CMon aims to support this important service by providing a reliable monitoring service.

5. Conclusion

Multi-domain network monitoring is essential in order to facilitate research and collaboration. CMon, using services of GÉANT perfSONAR MDM, provides the flexibility and extensibility to adapt either to further changes in the CPSs or to the expansion of the NREN community. By using federated measurement services already in place, circuit monitoring in a multi-domain environment becomes realistic and beneficial to its users. Commercial off-the-shelf (COTS) products do not cater to this field. There may well be other research and development in the worldwide community on end-to-end monitoring, but CMon is being developed as a unique solution in its use of federated measurement services for this purpose. It is also being developed to support other methods, such as SNMP, OAM and an API to a domain's own NMS (Network Monitoring System), making the solution flexible and technology-agnostic so that it can be deployed in any network. The architecture has also been designed with possible future ways of collecting monitoring information in mind. The future research and development plan will focus on areas such as an intuitive user interface, data verification and access control, and scalability.

Acknowledgements

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7 2007–2013) under Grant Agreement No. 238875 (GÉANT).

References

Guok, C., Robertson, D., Thompson, M., Lee, J., Tierney, B., and Johnston, W., 2006. Intra and interdomain circuit provisioning using the OSCARS reservation system. Broadband Communications, Networks and Systems.

Lukasik, J., 2011. AutoBAHN overview. Available: http://geant3.archive.geant.net/service/autobahn/Resources/Pages/Resources.aspx

Yampolsky, M. and Hamm, M. K., 2007. Management of multidomain end-to-end links: a federated approach for the pan-European research network GÉANT2. IM'07, 10th IFIP/IEEE International Symposium on Integrated Network Management, IEEE, pp. 189–198.

Zurawski, J. and Brown, A., 2011. perfSONAR Introduction. Available: http://www.internet2.edu/workshops/npw/materials/WJT2011/20110202-NPW-pS-Introduction.pdf

Biographies

Hao Yu joined DTU (Denmark) as a postdoctoral researcher in April 2011. His research interests include network architecture for the Future Internet, network management, software-defined networking, etc. Hao has a PhD in Telecommunications.

Trupti Kulkarni joined DANTE in April 2009. She is a Senior Software Engineer and is responsible for the design, development and maintenance of various applications used for inventory management and monitoring of the GÉANT network. Trupti has a Bachelor's degree in Computer Science and Engineering.

Feng Liu joined LRZ (Germany) as a Research Scientist in August 2011. His research fields include the Future Internet, software-defined networks and self-managing/autonomic capabilities of communication networks. Feng has a PhD in Computer Science from Ludwig-Maximilians-Universität München, Germany.