The Missing Link: Putting the Network in Networked Cloud Computing

Ilia Baldine, Yufeng Xin, Daniel Evans, Chris Heerman
Renaissance Computing Institute (RENCI)

Jeff Chase, Varun Marupadi, Aydan Yumerefendi
Department of Computer Science, Duke University

This work was supported by the National Science Foundation GENI Initiative, NSF award CNS-0509408, and an IBM Faculty Award.

1. INTRODUCTION

The backbone of IT infrastructure is evolving toward a service-oriented model, in which distributed resources, whether software or hardware, can be composed into a customized IT service on demand. In particular, cloud computing infrastructure services manage a shared "cloud" of servers as a unified hosting substrate for diverse applications, using various technologies to virtualize servers and orchestrate their operation. Emerging cloud infrastructure-as-a-service efforts include Eucalyptus, Nimbus, Tashi, OpenCirrus, and IBM's Blue Cloud.

Extending cloud hosting into the network is a crucial step toward on-demand allocation of complete networked IT environments. This paper reports on our effort to extend cloud resource control to cloud networks with multiple substrate providers, including network transit providers. Our vision is to enable cloud applications to request virtual servers at multiple points in the network, together with bandwidth-provisioned network pipes and other network resources to interconnect them. This capability is a significant advance beyond the cloud infrastructure-as-a-service models that are generating so much excitement today.

This paper reports on a RENCI-Duke collaboration (http://www.geni-orca.renci.org) to build a cloud network testbed for the Global Environment for Network Innovations (GENI) Initiative recently launched by the National Science Foundation and BBN. GENI (http://www.geni.net) is an ambitious, futuristic vision of cloud networks as a platform for research in network science and engineering. A key goal of GENI is to enable researchers to experiment with radically different forms of networking by running experimental systems within private, isolated slices of a shared testbed substrate. A GENI slice gives its owner control over some combination of virtualized substrate resources assigned to the slice, which may include virtual servers, storage, programmable network elements, networked sensors, mobile/wireless platforms, and other programmable infrastructure components attached to the cloud network. GENI slices are built to order for the needs of each experiment.

We focus on progress in building a unified control framework for a prototype GENI facility incorporating RENCI's optical network stacks on the Breakable Experimental Network (BEN). BEN is a testbed for open experimentation on dedicated optical fiber that spans the Research Triangle area and links server clusters on each campus. We have demonstrated a key milestone: on-demand creation of complete end-to-end slices with private IP networks linking virtual machines allocated at multiple sites (RENCI, Duke, and UNC). The private IP networks are configured within stitched layer-2 VLANs instantiated from the BEN metro-scale optical network and the National LambdaRail (NLR) FrameNet service. In the context of GENI, this capability enables a researcher to conduct safe, reproducible experiments with arbitrarily modified network protocol stacks on a private, isolated network that meets defined specifications for the experiment.

2. A CONTROL FRAMEWORK FOR A MULTI-LEVEL CLOUD NETWORK

Our ultimate goal is to manage the network substrate as a first-class resource that can be co-scheduled and co-allocated along with compute and storage resources, to instantiate a complete built-to-order network slice hosting a guest application, service, network experiment, or software environment. The networked cloud hosting substrate can incorporate network resources from multiple transit providers and server hosting or other resources from multiple edge sites (a multi-domain substrate).

Cloud networks present new challenges for control and management software. How do we incorporate diverse substrate resources into a unified cloud hosting environment? How do we allocate and configure all the parts of a guest environment (a slice of the cloud network) in a coordinated way? How do we "stitch" interconnections among substrate resources obtained from different providers to create a seamless end-to-end slice? How do we protect the security and integrity of each provider's infrastructure, and protect hosting providers from abuse by the hosted guests? How do we verify that a slice built to order for a particular guest is in fact behaving as expected? How do we ensure isolation of different guest slices hosted on the same substrate? And how do we provide connectivity across slices when connectivity is desired, and police the flow of traffic?

2.1 BEN Substrate

IP networks are often deployed as overlays on dedicated circuits provisioned from an underlying network substrate. Networks that support both IP overlays and dynamic circuit provisioning are known as hybrid or multi-layer networks. The regional Breakable Experimental Network (BEN) is an example of a multi-layer optical network. In 2008, the Triangle universities (UNC-CH, Duke, and NCSU), in collaboration with RENCI (the Renaissance Computing Institute) and MCNC, began the rollout of this metro-scale optical testbed. BEN consists of dark fiber, provided by MCNC, interconnecting sites (BEN PoPs) at the three universities, RENCI, and MCNC. It gives university researchers access to a unique facility dedicated exclusively to experimentation with disruptive technologies.

RENCI has installed access equipment at each of the BEN PoPs, based on Polatis fiber switches that mediate access to the shared fiber. Above each Polatis switch, RENCI maintains a default stack of network equipment that can provision dynamic circuits between pairs of PoPs and instantiate layer-2 VLANs and IP connectivity across those circuits. Figure 1(a) depicts the stack of network elements at each BEN PoP, reflecting the multiple layers of the BEN network: at the bottom of the stack is an all-optical fiber switch, in the middle an optical transport network switch (Infinera DTN), and at the top an Ethernet switch (Cisco 6509).

The BEN network architecture defines adaptations at each layer; Figure 1(b) shows the functional diagram of the layer stack. The Infinera DTN is equipped with multiple 10 Gigabit Ethernet (10 GE) client-side interfaces that connect to the 10 GE line-side interfaces of the 6509, which itself exposes multiple 1 Gigabit Ethernet (1 GE) client-side interfaces. The DTN first adapts each 10 GE signal onto a wavelength, then multiplexes 10 wavelengths into an internal channel group (optical carrier group, OCG), and finally multiplexes up to four channel groups onto a line-side fiber.

BEN includes a secure management plane: a private IP network for communicating with the control interfaces on the various network elements. These control interfaces accept management commands to provision circuits, link them together into well-formed networks, and expose them as VLANs at the BEN edge. Some of the BEN PoPs also have links to NLR FrameNet endpoints, which can be used to link VLANs through NLR's national-footprint network and connect them with the VLANs hosted on BEN.
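Taken together, this adaptation hierarchy determines the line-side capacity of each BEN fiber. Assuming each wavelength carries a single 10 Gb/s client signal, as described above:

$$4~\tfrac{\mathrm{OCG}}{\mathrm{fiber}} \times 10~\tfrac{\lambda}{\mathrm{OCG}} \times 10~\tfrac{\mathrm{Gb/s}}{\lambda} = 400~\mathrm{Gb/s\ per\ fiber}.$$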

2.2 ORCA Control Framework

Our control framework software is based on the Open Resource Control Architecture (ORCA) [Irwin et al. 2006; Chase et al. 2007; Yumerefendi et al. 2007; Chase et al. 2008; Constandache et al. 2008; Lim et al. 2009], an extensible platform for dynamic "leasing" of resources in a shared network infrastructure. The ORCA platform is in open-source release as a candidate control framework for GENI, and is a basis for ongoing research on secure cloud computing and autonomic hosting systems.

For this project, we developed plug-in handler extensions for ORCA to control BEN network elements by issuing commands over the secure management plane. We also developed plug-in resource control extensions to coordinate allocation of BEN circuits and VLAN tags, and to oversee VLAN linkages. Finally, we extended the virtual machine handlers in ORCA to connect virtual machines to VLANs and configure them as nodes in an IP network overlaid on those VLANs. In this way, a guest can ask the ORCA service to allocate virtual machines on server sites adjacent to the BEN PoPs on each campus, link them with a transit network dynamically provisioned from BEN, and configure them to form a complete private IP network. Users can build a network through a Web portal interface, or with a programmed slice controller that interacts with ORCA resource servers to build and control their custom network.
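To make the handler plug-in pattern concrete, the sketch below shows what a setup/teardown handler for a BEN VLAN might look like. This is a minimal illustration only: the type names (SubstrateHandler, Lease) are hypothetical rather than ORCA's actual API, and the 6509 command lines are representative IOS-style syntax, not the verified BEN configuration procedure.

```java
import java.util.List;

/** Illustrative sketch only; ORCA's real handler interface differs. */
interface SubstrateHandler {
    List<String> setup(Lease lease);     // commands to provision the resource
    List<String> teardown(Lease lease);  // commands to release it
}

/** A lease for one VLAN tag on one switch port, as assigned by a
 *  resource control plug-in. */
record Lease(int vlanTag, String port) {}

/** Emits Cisco-6509-style CLI commands to be sent over BEN's secure
 *  management plane. */
class Ben6509VlanHandler implements SubstrateHandler {
    public List<String> setup(Lease l) {
        return List.of(
            "vlan " + l.vlanTag(),
            "interface " + l.port(),
            "switchport trunk allowed vlan add " + l.vlanTag());
    }
    public List<String> teardown(Lease l) {
        return List.of("no vlan " + l.vlanTag());
    }
}
```

Separating the handler (configuration actions) from the resource control plug-in (allocation decisions, such as which VLAN tag to use) keeps substrate-specific logic out of the leasing core.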

[Figure 1 appears here. Panel (a), "BEN PoP Network Element," shows the per-PoP stack: the Cisco 6509 Ethernet switch on top, the Infinera DTN (with 10 GE TAM client modules and a BMM line module) in the middle, and the reconfigurable Polatis fiber switch at the bottom. Panel (b), "Layer Adaptation Functional Diagram," annotates the link capacities at each layer: 1 GE and 10 GE link connections into the 6509 and DTN, DTF channels of 10 Gb/s each (DTF-1 through DTF-10), OCG link connections of 1 OCG = 10 DTF = 100 Gb/s (OCG-1 through OCG-4), and a fiber link connection of 4 OCG = 400 Gb/s.]

Fig. 1. Network elements in each PoP of BEN, a multi-layer transport network.

2.3 A Language for Cloud Networks

One focus of the project is to advance standards and representations for describing network cloud substrates declaratively. There is a need for a common declarative language that can represent multi-level physical network substrates, complex requests for network slices, and the virtualized network resources (e.g., linked circuits and VLANs) leased for a slice, i.e., allocated and assigned to it. Ideally, we could specify all substrate-specific details declaratively, so that many diverse substrates can be incorporated into a network cloud based on a general-purpose control framework and resource leasing core. Declarative representations are difficult in this domain because of the need to express complex relationships among components (e.g., network adjacency), properties and constraints of each network level, and constraints involving multiple levels.

Our approach extends the Network Description Language (NDL) [Ham et al. 2008]. NDL representations are documents in RDF (Resource Description Framework), a syntax for describing sets of objects and their properties and relationships (predicates). NDL is an ontology: a set of resource types and relationships (properties or predicates) that make up a vocabulary for describing complex networks in RDF syntax. An NDL document uses this vocabulary to specify a set of resource elements and the relationships among them, whose meanings are defined by NDL. NDL has been shown to be useful for describing heterogeneous optical network substrates and identifying candidate cross-layer paths through those networks.

One contribution of the project is to extend NDL with a more powerful ontology defined using OWL (the Web Ontology Language). The result is an NDL-compatible extension that we refer to as NDL-OWL. The ultimate goal is a representation language that is sufficiently powerful to enable generic resource control modules to reason about substrate resources and the ways that the system might share them, partition them, and combine them. Each resource control action, such as allocating or releasing resources for a slice, affects the disposition of the remaining substrate inventory. To meet our goals, the declarative representation must also capture these substrate-specific constraints on allocation and sharing. These constraints are crucial for the resource control plug-in modules in ORCA, which are responsible for allocating and configuring substrate resources for each slice.

OWL is an RDF vocabulary for describing ontologies. The power of OWL derives from a rich vocabulary for defining relationships among the resource types and among the predicates in the ontologies that it describes. In addition to hierarchical classes and predicates, OWL introduces logic-expressive capabilities, including class constraints such as disjointness, intersection, union, and complement, and property constraints such as transitivity, symmetry, inverses, and cardinality. An OWL ontology uses these capabilities to define the structure and relationships of the predicates and resource types that make up the ontology's vocabulary. Given knowledge of these relationships, an inference engine can ingest an RDF document based on the ontology and manipulate it or infer additional properties beyond those explicitly represented in the document. For example, in NDL-OWL, the hasInterface and interfaceOf properties are related in the ontology by the OWL inverseOf property axiom: software can infer the property in one direction from a statement that the inverse property holds in the other. We use the OWL TransitiveProperty axiom to define connectivity and adaptation properties. These features are useful for path-finding algorithms: for example, if each consecutive pair of points in a sequence is connected, an end-to-end path between the endpoints can be inferred.
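The following sketch illustrates this style of inference using Apache Jena, one common OWL toolkit (the paper does not name a specific implementation; the namespace and resource names below are made up for illustration):

```java
import org.apache.jena.ontology.ObjectProperty;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.OWL;

public class NdlOwlInference {
    // Hypothetical namespace; the real NDL-OWL schema URI may differ.
    static final String NDL = "http://example.org/ndl-owl#";

    public static void main(String[] args) {
        // In-memory model with a rule reasoner that understands
        // owl:inverseOf and owl:TransitiveProperty, among other axioms.
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);

        ObjectProperty hasInterface = m.createObjectProperty(NDL + "hasInterface");
        ObjectProperty interfaceOf  = m.createObjectProperty(NDL + "interfaceOf");
        hasInterface.addInverseOf(interfaceOf);           // owl:inverseOf axiom

        ObjectProperty connectedTo = m.createObjectProperty(NDL + "connectedTo");
        connectedTo.addRDFType(OWL.TransitiveProperty);   // owl:TransitiveProperty axiom

        Resource device = m.createResource(NDL + "Duke6509");
        Resource ifA = m.createResource(NDL + "ifA");
        Resource ifB = m.createResource(NDL + "ifB");
        Resource ifC = m.createResource(NDL + "ifC");

        // Asserted facts: the device owns ifA, and ifA-ifB-ifC form a chain.
        device.addProperty(hasInterface, ifA);
        ifA.addProperty(connectedTo, ifB);
        ifB.addProperty(connectedTo, ifC);

        // Both statements below are inferred rather than asserted.
        System.out.println(ifA.hasProperty(interfaceOf, device)); // true (inverse)
        System.out.println(m.contains(ifA, connectedTo, ifC));    // true (transitive)
    }
}
```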

RDF and OWL were developed as core technologies for the Semantic Web and are widely used W3C standards [Antoniou and Harmelen 2008]. They are powerful, flexible, and expressive formalisms for representing structured knowledge, and they are especially well suited to modeling graph structures such as complex network clouds.

We have developed an ontology-based cross-layer network provisioning service system with the following components: (1) a suite of ontologies (NDL-OWL) that can describe diverse network and compute resources; (2) representations of user requests and of allocated subnetworks (slices) at multiple levels of abstraction; (3) abstractions and accounting for available and used resources, integrated with the policy controller interfaces of the ORCA control framework; and (4) common end-to-end path and virtual topology mapping and release APIs that can generate schedules of configuration actions for the network elements.

3. NDL-OWL

We emphasize a common suite of ontology elements that can describe the physical network substrate, requests for allocations of slice resources from the cloud network, and the current configuration of a partially allocated substrate after some set of requests has been satisfied. NDL, the basis for our work on NDL-OWL, is sufficiently powerful to express network topology and connectivity at multiple layers or levels of abstraction. NDL also models the adaptations between layers in a multi-layer network setting (see Figure 1(b)). For example, each transport service at a given layer (WDM, SONET/SDH, ATM, Ethernet, etc.) supports some set of defined adaptations, e.g., different styles of Ethernet over WDM (such as 10GBASE-R) and VLAN over native Ethernet. Consistent and compatible adaptations between layers must be present to establish connectivity along a path.

The fundamental classes and properties in NDL include the Interface class; the Adaptation class, which defines the adaptation relationship between layers; the connectedTo and linkedTo predicates, which define connectivity between instances of Interface; and the switchedTo predicate, which defines a cross-connect within a switching matrix among a group of interfaces. A valid path between two devices normally comprises a sequence of triples combining the properties hasInterface, adaptation, connectedTo, linkedTo, adaptationOf, and interfaceOf.
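Such a triple chain can be checked mechanically. As an illustration (not taken from the paper), a SPARQL 1.1 property-path query over an NDL-OWL model can test whether any chain of these predicates links two devices; the namespace here is again hypothetical:

```java
import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.query.QueryFactory;
import org.apache.jena.rdf.model.Model;

public class PathCheck {
    /** True if some chain of NDL-OWL predicates links the two devices. */
    public static boolean pathExists(Model m, String fromUri, String toUri) {
        String q =
            "PREFIX ndl: <http://example.org/ndl-owl#>\n" +
            "ASK { <" + fromUri + ">\n" +
            "        ndl:hasInterface\n" +
            "        / (ndl:adaptation | ndl:connectedTo | ndl:linkedTo | ndl:adaptationOf)+\n" +
            "        / ndl:interfaceOf\n" +
            "      <" + toUri + "> }";
        try (QueryExecution qe = QueryExecutionFactory.create(QueryFactory.create(q), m)) {
            return qe.execAsk();
        }
    }
}
```

Note that a query like this finds only syntactically connected chains; checking that the adaptations along the path are mutually compatible still requires the layer constraints described above.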

3.1 Accounting for Dynamic Provisioning

In addition to specifying the topology of the network substrate and the adaptations between layers, an NDL-OWL model incorporates concepts necessary for dynamic service provisioning, such as the capacity of a network resource, e.g., bandwidth and QoS attributes. One important concept introduced by NDL is the Label: an entity that distinguishes or identifies a given connection or adaptation instance among others sharing a given network component. For example, some labels correspond to channel IDs along a physical link, e.g., a particular fiber in a conduit, a wavelength along a fiber, or a time slot in a SONET or OTN frame. Labels may be viewed as a type of resource to be allocated from a label pool associated with each component: the label range is fixed, and a particular physical channel has a fixed capacity. For example, in an 802.1Q tagged Ethernet network, the VLAN ID serves as the unique resource label and has a fixed range (0-4095).

NDL-OWL generalizes the NDL concept of Label to enable dynamic accounting of network resources. We extend the Label class to associate capacity and QoS characteristics with each transport entity. NDL-OWL defines two properties, availableLabelSet and usedLabelSet, to track dynamic resource allocation. We use the
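To make the label-accounting idea concrete, here is a minimal sketch of a VLAN-tag label pool in the spirit of availableLabelSet and usedLabelSet. It is an illustration under stated assumptions, not code from the ORCA release:

```java
import java.util.BitSet;

/** Sketch of label-pool accounting specialized to 802.1Q VLAN tags:
 *  the used set is a bitmap over the fixed 12-bit tag space, and the
 *  available set is its complement. */
class VlanLabelPool {
    private static final int RANGE = 4096;   // 12-bit tag space: 0..4095
    private final BitSet used = new BitSet(RANGE);

    /** Allocate the lowest free tag, or -1 if the pool is exhausted. */
    synchronized int allocate() {
        int tag = used.nextClearBit(1);       // tag 0 is reserved in 802.1Q
        if (tag >= RANGE - 1) return -1;      // tag 4095 is also reserved
        used.set(tag);
        return tag;
    }

    /** Return a tag to the available set when its slice is torn down. */
    synchronized void release(int tag) {
        used.clear(tag);
    }
}
```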
