2011 Technical Symposium at ITU Telecom World (ITU WT)

Towards 0-Touch Networks

Antonio Manzalini, Telecom Italia – Future Centre, Torino, Italy
Corrado Moiso, Telecom Italia – Future Centre, Torino, Italy
Roberto Minerva, Telecom Italia – Future Centre, Torino, Italy, and Institute Telecom Sud Paris, France
Abstract—Future networks will be based on optical core infrastructures (with a limited number of nodes) inter-connecting different local networks (with optical and/or radio local connectivity) populated with a myriad of heterogeneous nodes with communication, processing and storage capabilities. This evolution will transform future networks into complex systems. Traditional management and control approaches – still valid for the core – will not be applicable anymore at the edge. The paper advocates the introduction of cognitive capabilities at the edge of Telco networks for taming the growing complexity of pervasive edge nodes. In parallel, the traditional management approach of core networks will be enhanced to mitigate and ease human effort (reducing OPEX). The paper extends these concepts towards the orchestration of different virtualized resources, yielding a "network of networks" dynamic environment.

Keywords: autonomic networks, cognitive networks, auto-configuration, dynamic resource allocation

I. INTRODUCTION

The Internet is increasing its impact on our daily lives, progressively becoming an inherent part of strategic areas such as communications, electricity control, transportation, utilities and resources management, healthcare, manufacturing, commerce, etc. In the future, any smart object with embedded communication, processing and storage capabilities will be able to process, store, receive and transmit information to and from other devices (of humans and machines) and objects through omni-pervasive communication networks.

A huge wave of data is expected over the next five to ten years, due to the mass digitalization of Users' devices and the growing availability and adoption of e-services across government, health, infotainment and security. In addition, new requirements will emerge from residential and business users as well as from scientific and industrial communities, driving the deployment of high-performance and pervasive networks.

Along this evolution, future networks will be characterized by the intertwining of IT, Telecommunications and Sensor technologies. Networked systems will use processing, communication and storage capabilities in an interchangeable manner for carrying out highly distributed computational and communication tasks. Computational capabilities will be further enhanced by the possibility to interact with the external world by means of sensors and actuators.

In this context, future networks will evolve towards flatter (less hierarchical) architectures. Public Networks will be based on optical core infrastructures (with a limited number of big nodes) inter-connecting different local areas (with optical and/or radio local connectivity) populated at the edge with a myriad of heterogeneous nodes. On the same physical network it will be possible to design and deploy coexisting logical architectures best fitting service demand and load/traffic engineering.

Specifically, at the edges of the networks (i.e. in the access segment), nodes will create "networks of networks" made of heterogeneous and highly interconnected (real/virtual) entities (e.g. from sensors to smart things, from Users' devices to servers, from access gateways to access/metro nodes). In this context, dynamic aggregations of competing (sub-)networks (belonging to the same, or different, Operators) will emerge. They will support any sort of service by using (aggregations of) local processing and storage resources according to game models. This evolution will transform future networks (at the edges) into large-scale complex systems.

This level of complexity will make network design and management highly challenging: traditional management and control approaches – still valid for the core – will not be applicable anymore at the edge. Networks should have the capability to self-adapt and self-configure (with limited human intervention) to satisfy dynamically changing service demands on communication, processing and storage. Each node (even the smallest ones) will have some intelligent capabilities, and nodes will organize around tasks to accomplish. This will lead to the aggregation of nodes into dynamic opportunistic networks that self-organize and use the available resources in order to accomplish their goals. In this highly adaptive context, the concept of network or system boundaries will blur in favor of an adaptive complex system view.

There is a stringent need to embrace new concepts for the future management and control of networked systems: one research path is to exploit self-management and self-organization for easing the communication needs of users and highly distributed pervasive services, i.e., networks that do not require heavy human intervention.

A change of paradigm is expected in the design and management of the network: from "maximizing performance with minimum resources" to "keeping good enough performance with lower operational costs". Emphasizing this aspect, we term these networks "0-Touch". A working definition of a 0-Touch Network is as follows: a complex network that exposes intelligent but simple (and possibly distributed) functions that simplify the global manageability of the whole environment and the optimization of the comprised components, to the advantage of Users and Providers.
II. FROM SELF-ORGANIZING TO 0-TOUCH NETWORKS
The Internet has been considered the first example of a self-organizing and highly robust network. Indeed, today's IP-based networks are widely and successfully adopted worldwide. Nevertheless, there are still configuration problems (getting more and more complicated) due to the complexity of the control plane running on network elements (e.g. for implementing the distributed routing algorithms) as well as the growth of the management plane (which monitors and configures data-plane mechanisms and control-plane protocols).

As an example, consider that the states used to implement the different control and management functions are governed by entities which have to be configured through several low-level configuration commands (mostly hand-made); furthermore, there are several dependencies among these states and the logic updating them (most of which are not kept aligned automatically). Multiple studies have shown that configuration errors are a large portion of Operator errors, which are in turn the largest contributor to failures and repair time. The more complex the systems to configure get, the more errors will appear if we keep humans in the loop. For example, IP and Ethernet originally embedded decision logic for dynamic path computation in distributed protocols, whose complexity grew incrementally with the growth of the Internet. The growing complexity and dynamicity of future networks (above all at the edges) pose serious doubts about the efficiency of extending these distributed control protocols further.
There are other examples of networks able to adapt to changes in the environment, even if they are confined to specific problems related to connectivity (e.g., MANET, ad hoc networks, wireless sensor networks), or are aimed at specific applications (e.g., peer-to-peer overlay network applications like VoIP or file sharing).

A more interesting example is the Self-Organizing Network (SON), introduced as part of the 3GPP Long Term Evolution (LTE). SONs aim at reducing the cost of installation and management by simplifying operational tasks through automated mechanisms such as self-configuration and self-optimization. SON solutions are expected to introduce robustness, scalability and responsiveness, and to enable effective integration into existing operations.

Nevertheless, there is a possible reverse side of the coin. Provided that we find the way to operate self-managing and self-organizing networks (thus reducing human errors and mistakes), we still have the problem of assuring network stability: cascading and nesting of self-* mechanisms can lead to the emergence of non-linear network behaviors, leading to instabilities that jeopardize network performance.

In order to introduce 0-Touch principles in Telco networks, the paper describes an initial set of principles and guidelines that in the long run should contribute to the definition of a shared and consolidated open architecture. Figure 1 depicts the high-level functional model of a 0-Touch network for the Telco domain.

Figure 1: 0-Touch Network for the Telco Domain

Resources (e.g., communication, processing, storage) made available by the Telco Domain can be virtualized, and their virtual images can be integrated in overlay virtual networks. Each virtual image should be enhanced with control loops and mechanisms in order to make it autonomic.

The Network Operating System, which empowers the network nodes, allows self-adaptation and self-organization of the virtualized resources. In particular, virtual networks are set up according to the needs and requirements of customers' applications; each virtual network will make use of the available resources according to its own policies and needs, possibly competing with other virtual networks for access to specific resources, and will offer a distributed virtual execution environment to provide services.

The virtualization of resources enables the clear decoupling of valuable resources from applications. The crash or misuse of a virtual resource is confined to its virtual network (e.g., by applying fault recovery policies enforced by self-healing capabilities) and has no impact on other virtual networks. Each virtual network will put in place network-specific logic and behaviors in order to optimize the usage of the allocated resources.
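The confinement and self-healing behavior just described can be sketched in a few lines of Python; the sketch is ours, not part of the paper's architecture, and the class, method names and failure model are hypothetical:

    # Minimal sketch of an "autonomic" virtual resource: the virtual image is
    # paired with a control loop that restores it when a fault is observed.
    # All names (VirtualResource, heartbeat, restart) are hypothetical.
    import random

    class VirtualResource:
        def __init__(self, name):
            self.name = name
            self.alive = True

        def heartbeat(self):
            # Simulated monitoring probe: 10% chance the resource has crashed.
            self.alive = self.alive and random.random() > 0.1
            return self.alive

        def restart(self):
            # Self-healing action, confined to this virtual resource.
            self.alive = True

    def control_loop(resource, ticks=20):
        for t in range(ticks):
            if not resource.heartbeat():
                print(f"tick {t}: {resource.name} failed, self-healing")
                resource.restart()

    control_loop(VirtualResource("vr-storage-01"))

The point of the sketch is only the structure: the recovery policy runs with the resource itself, so a crash never propagates outside the virtual network that owns it.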
The 0-Touch Management Layer is designed to complement the Network Operating System in the management and control functions, whilst keeping track of the behavior and capabilities of the entire network. It matches the available capabilities and resources with the dynamic requirements of the virtual networks (and of the applications running on them) supported by the networked infrastructure. In other terms, this layer is responsible for the global optimization of resource usage. While a single virtual network optimizes the usage of its allocated resources (trying to benefit the most from them), the 0-Touch Management Layer copes with the optimization and fair distribution of the available resources with respect to the whole set of virtual networks. In other terms, it tries to solve an optimization problem similar to the "Tragedy of the Commons".
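One operational reading of this role (our assumption; the paper names the goal, not an algorithm) is a max-min fair division of a shared capacity among competing virtual networks, sketched below:

    # Hypothetical sketch: the 0-Touch Management Layer dividing a shared
    # capacity among virtual networks with max-min fairness (water-filling).
    # The paper names the optimization goal but not an algorithm; this is an
    # assumed illustration.
    def max_min_fair(capacity, demands):
        alloc = {vn: 0.0 for vn in demands}
        unsatisfied = dict(demands)
        remaining = capacity
        while unsatisfied and remaining > 1e-9:
            share = remaining / len(unsatisfied)
            for vn, d in list(unsatisfied.items()):
                give = min(share, d - alloc[vn])
                alloc[vn] += give
                remaining -= give
                if alloc[vn] >= d - 1e-9:
                    del unsatisfied[vn]
        return alloc

    print(max_min_fair(10.0, {"vn1": 2.0, "vn2": 5.0, "vn3": 8.0}))
    # vn1 is fully served (2.0); vn2 and vn3 split the rest fairly (4.0 each),
    # so no virtual network can starve the others.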
The network aspects should be coupled with a complementary vision of IT infrastructure evolution. The approach, envisioned in [9], is to create a distributed infrastructure of virtualized computing resources by aggregating heterogeneous nodes, from servers to users' devices. The transformation of such an ensemble of heterogeneous resources into a distributed computing infrastructure requires the introduction of functions implementing self-CHOP features for the (0-Touch) supervision of IT virtualized capabilities, organized according to distributed (either decentralized or hierarchical) approaches.
III. ENABLING TECHNOLOGIES
From a technological point of view, the proposed architecture is based on some emerging technologies. Overlay networking hooking virtual resources represents a sound basis for coping with the challenges of widely pervasive and distributed systems. Coupled with this, two other increasingly important technological breakthroughs are considered: autonomic and cognitive networking (the two approaches are seen as synergic).

Autonomic solutions took inspiration from the biological characteristics of the human Autonomic Nervous System, which acts and reacts to stimuli independently of the individual's conscious decisions. It regulates the behavior of internal organs (e.g., the heart to control blood flow and heartbeat, the stomach and intestines to control digestive movement and secretions). It sends commands to the organs and, via sensory fibers, receives feedback on their condition, information that helps to maintain "homeostasis". The Autonomic Nervous System implements a control loop through which it can react, according to "predefined policies", to changes in the internal organs and in environmental conditions (e.g., external temperature), hiding the complexity of the control from the conscious part of the Nervous System.

IBM, as part of its autonomic computing initiative, derived from the metaphor of the Autonomic Nervous System a management architecture able to cope with the increasing complexity of IT systems. This architecture is based on the so-called MAPE-K (Monitor Analyze Plan Execute – Knowledge) loop [1], implemented in autonomic managers (in charge of controlling resources). Essentially, sensors, probes, etc. feed information to a Monitor function; the system then proceeds to Analyze the information, form a Plan, and then Execute it through effectors that apply the plan of action. Knowledge is a fundamental aspect for the correct functioning of this cycle: it consists of data shared among the MAPE-K functions of an autonomic component, such as symptoms and policies.

A system that implements a MAPE-K control loop is capable of sensing environment changes and sending commands to adapt its behavior to the environment; in this way it is possible to achieve the self-* autonomic capabilities (e.g. self-CHOP: configuration, healing, optimization, protection).
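As a minimal illustration of the loop (our sketch, with toy metrics and policies; not IBM's reference implementation), the four stages can be written as plain functions sharing a knowledge store:

    # Minimal MAPE-K sketch: Monitor, Analyze, Plan, Execute sharing a
    # Knowledge store. Metrics, symptoms and policies are toy assumptions.
    knowledge = {"policy_high_load": 0.8, "history": []}

    def monitor(resource):
        metric = resource["load"]                # probe/sensor reading
        knowledge["history"].append(metric)
        return metric

    def analyze(metric):
        if metric > knowledge["policy_high_load"]:
            return "overload"                    # detected symptom
        return None

    def plan(symptom):
        return {"action": "scale_out"} if symptom == "overload" else None

    def execute(resource, change):
        if change and change["action"] == "scale_out":
            resource["instances"] += 1           # effector applies the plan
            resource["load"] /= 2                # naive model of the effect

    resource = {"load": 0.9, "instances": 1}
    for _ in range(3):                           # three turns of the loop
        execute(resource, plan(analyze(monitor(resource))))
    print(resource)                              # {'load': 0.45, 'instances': 2}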
In the original proposal there is a separation between the supervisor, which implements the MAPE-K loop, and the system under supervision. This model can be applied recursively to create a cascade of supervisors and supervised systems, in a hierarchical way. This original MAPE-K model is not suitable to describe the autonomic behavior of a bio-inspired service ecosystem: in fact, it would model an ecosystem where the individuals are structured in strict hierarchies; in nature, on the other hand, such manager-managed relationships are much less relevant than those established in a peer-wise way. Moreover, individuals are able to react independently, i.e. "autonomously", to most of the external events they perceive, and are able to cooperate by activating peer-to-peer relationships among them.

Therefore, the evolution of this approach entails removing the distinction between supervisor and supervised systems, by embedding in each system component an autonomic control loop in charge of monitoring its internal behavior and adapting it to internal events and to changes in its external environment. In this way, the components themselves become "self-aware", by integrating self-CHOP features in their behavior. The strict distinction between managed and managing components is removed when all the components implement self-management logic, and this makes peer cooperation possible through interactions with the other components in the system.

The embedding of autonomic control loops in the components also enables grassroots approaches: in this case, the control loops are extremely simple, e.g., based on the monitoring and adjustment of just a few local variables, and oriented to handle simple exchanges of information with the environment through local interactions. The local interactions of large numbers of components embedding much simpler control loops enable the appearance of global system properties, as sketched below.
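A toy sketch of such a grassroots loop (our example; the ring topology and the rule are invented): each node monitors a single local variable, its queue length, and applies one local rule, yet a near-balanced global state emerges without any supervisor.

    # Grassroots sketch: each node's control loop watches one local variable
    # (its queue length) and applies a single local rule: hand a task to a
    # neighbour whose queue is shorter by two or more. Near-balanced load
    # emerges globally without any supervisor. Topology is an assumed ring.
    queues = [9, 0, 4, 1, 6]                      # tasks per node

    def step(queues):
        n = len(queues)
        moved = False
        for i in range(n):
            j = (i + 1) % n                       # right-hand neighbour
            if queues[i] - queues[j] >= 2:
                queues[i] -= 1; queues[j] += 1; moved = True
            elif queues[j] - queues[i] >= 2:
                queues[j] -= 1; queues[i] += 1; moved = True
        return moved

    while step(queues):
        pass
    print(queues)  # [4, 4, 3, 4, 5]: every queue within one task of its neighbours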
Figure 2: Self-* capabilities through autonomic control loops

Cognitive networking is a sort of extension of the autonomic control loop: it introduces, in the individual resources and in a whole new layer, the capabilities for creating, updating and using dynamic knowledge of the behavior of single components, of interconnected systems, or of the whole network. This knowledge is created and maintained by applying and using new semantic and numerical algorithms together with reasoning and learning techniques [2].
This approach leads to the definition of a cognitive network, inspired by [3] and aligned with the definition provided in [4]: "a cognitive network is a network with a cognitive process that can perceive current network conditions, and then plans, decides and acts on those conditions. The network can learn from these adaptations and use them to make future decisions, all while taking into account end-to-end goals."
IV. DISTRIBUTING 0-TOUCH NETWORK CONTROL LOGIC
Future networks will be more and more pervasive and composed of a myriad of resources. Without a wide introduction of self-* mechanisms, these networks, above all at the edge, will not be manageable anymore, at least with the current paradigms.
Let's consider a network composed of very simple nodes, (metaphorically) as smart as ants or termites: nodes just perform traffic (packet) forwarding (no control-plane functions), so they are very cheap, and we can have plenty of them. Any device (with standard interfaces) can be a forwarding node. Control decision logic is not hardwired and locked in protocols (as it is today in IP, which is a weakness for the future); instead, a global network operating system is able to translate high-level network goals (e.g., load balancing, traffic engineering, survivability and other QoS requirements) and business objectives into low-level policies and configuration commands (e.g., automatically downloaded forwarding tables) enforced into the nodes [5] (where they are then rapidly actuated, with limited processing). This is an example of a flat, low-cost network architecture (building to a certain extent on OpenFlow): it is highly robust and flexible, as it is based on a great number of simple forwarding nodes. Swarm intelligence (a sort of network operating system) can be exploited on the Cloud – mirroring real nodes – and should be open and standard. The logic will be neither fully centralized nor fully distributed, but it will be exploited in a way which allows self-adaptation, by means of dynamic games of coordination and competition.
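The pattern above can be made concrete with a small sketch (ours, not from the paper): nodes hold only forwarding tables, and a logically centralized network operating system compiles a high-level reachability goal into per-node next-hop entries. The topology, names and goal are invented for illustration.

    # Toy sketch: dumb nodes hold only a forwarding table; a (logically
    # centralized) network OS computes routes from a high-level goal
    # ("make every node reach 'dst'") and downloads the resulting next-hop
    # entries. Topology and names are invented.
    from collections import deque

    topology = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}

    def compile_goal(dst):
        # BFS from the destination gives each node its next hop towards dst.
        next_hop, frontier = {dst: None}, deque([dst])
        while frontier:
            node = frontier.popleft()
            for nb in topology[node]:
                if nb not in next_hop:
                    next_hop[nb] = node          # forward towards the destination
                    frontier.append(nb)
        return next_hop

    tables = compile_goal("d")                   # control logic stays outside nodes
    print(tables)                                # {'d': None, 'b': 'd', 'c': 'd', 'a': 'b'}

Replacing the goal (or the topology) only changes what the network OS computes and pushes; the nodes themselves stay trivially simple.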
This evolution follows, metaphorically, the evolutionary path of Personal Computers (PCs), where the abstraction of the h/w substrate and the virtualization of processing allowed different OSs (e.g. Windows, Linux, Mac OS) to run on common, simple and stable hardware (e.g. x86).

Indeed, this approach helps PC programmability by allowing strong isolation whilst at the same time letting competition flourish above (at the application level). Similarly, Active Networking and Network Processors (from the mid and late '90s, respectively) made the first of several attempts at a clean separation between a simple common hardware substrate and an open programming environment on top.

Interestingly, a new perspective is also emerging, which we call Network Uploading: the network operating system could be uploaded on a hardware network substrate (made of physical resources for routing/switching packets/frames, providing links/connections, and raw storage and processing capabilities), allowing the implementation of a virtual network pursuing specific goals.

This vision is compliant with the Software-Defined Networking (SDN) approach being promoted by the Open Networking Foundation (ONF) [6]. The SDN basic idea is to exploit a software interface (i.e. OpenFlow) for controlling how packets are forwarded through network switches, and a set of management interfaces upon which more advanced management functions can be actuated.

V. IMPACT ON MANAGEMENT PROCESSES

A distinction should be made between core and edge networks. The 0-Touch approach can impact traditional management processes by simplifying and automating them, with a consequent reduction of human effort (and limitation of human mistakes [10]) and an optimization of the achieved performance. A description of the impact of the 0-Touch approach on management processes for the core network (Figure 3) is outside the scope of this paper.

Figure 3: eTOM processes (TM Forum)

On the other hand, the extension of the network to the edge and its expected growing complexity make the 0-Touch approach a very promising self-management approach, far beyond traditional processes like the ones reported in eTOM, which are not applicable at the edge.

At the edge, each device must be able to find and communicate with its peers, locally or remotely across a core network, with no configuration or other management effort on the part of the Network Operator or the Users. Current Internet protocols fall short of this purpose; as a matter of fact, skilled network administrators are still required to deploy and operate IP networks. For instance, BGP routers use the hierarchical structure of IP addresses, aggregating information about distant nodes and networks sharing a common address prefix into a single routing table entry. While this hierarchical address assignment scheme makes the core infrastructure efficient and scalable, it makes edge networks brittle and difficult to manage. Dynamic address assignment transfers administrative responsibility from edge nodes to DHCP servers, at the expense of making edge nodes unable to communicate at all without access to a DHCP server.
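The aggregation point can be illustrated with the Python standard library's ipaddress module (our example; the addresses are invented): four contiguous edge prefixes collapse into a single routing table entry, which is exactly what makes the core scalable and the edge opaque.

    # Illustration of prefix aggregation using only the standard library:
    # four contiguous edge prefixes collapse into one routing table entry.
    import ipaddress

    edge_prefixes = [ipaddress.ip_network(p) for p in (
        "10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24")]

    aggregated = list(ipaddress.collapse_addresses(edge_prefixes))
    print(aggregated)                            # [IPv4Network('10.0.0.0/22')]
    # One /22 entry now stands for four edge networks: efficient for the
    # core, but inflexible for the individual edge networks behind it.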
Ad-hoc networking protocols by themselves are not sufficient either, because they only scale to local-area networks of a few hundred or thousand nodes. There is a need for an approach capable of forwarding traffic from any node to any other without the help of hierarchically structured node addresses. As mentioned, IP and Ethernet originally embedded decision logic for dynamic path computation in distributed protocols, whose complexity grew incrementally with the growth of the Internet. The 0-Touch features of the Network Operating System introduce a layer on top of IP that allows overcoming said limitations, making edge networks self-organizing and self-managing and thus reducing the need for skilled network administrators. Moreover, the management of IT resources at the edge requires flexible and dynamic solutions in order to be able to rapidly react to changes in resource/connectivity availability and in application needs. The two use cases reported in the next section provide some concrete examples of the exploitation of 0-Touch capabilities.
VI. USE CASES
A first use case addresses the dynamic interactions of several sub-networks (belonging to the same Network Operator or even to different Network Operators) at the edge. In the future, service requests by Users will determine the dynamic allocation of local processing and storage (real/virtual) resources and the dynamic set-up and tear-down of (real/virtual) connectivity links hooking together said resources. Services will be provisioned end-to-end over networks which derive from the aggregation of sub-networks. Each of these participant sub-networks is expected to have its own controller (which is in charge of controlling the sub-network through certain architectures of lower-level controllers – these architectures are out of the scope of the current investigation). In this context, the provisioning of services with a certain QoS will imply communication and interaction between several controllers.

Each controller has to control its sub-network, and it has only a partial observation of the overall network over which the end-to-end services are provisioned. The exchange of information between controllers involves only partial observations, state estimates, or input values. Moreover, there might also be constraints on the communication between controllers. The problem is the coordinated control of the overall network over which the end-to-end services are provisioned, i.e. the accomplishment of a stable consensus among the controllers of the sub-networks composing the overall network. The control objective is to ensure that the controllers are cooperative as a group and that their outputs (sub-network configurations) all reach a consensus value which assures a resource allocation that guarantees the QoS objectives of the end-to-end service. Not only should the control objective be reached, but it should also be stable. 0-Touch features will address this problem by providing a space for the exploitation of consensus creation (between controllers) and of optimization algorithms/methods allowing the provisioning of end-to-end services, even across multiple sub-networks.
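As a hedged illustration of this consensus objective (the paper does not prescribe an algorithm), the sketch below runs a standard averaging-consensus iteration among three controllers that can only exchange estimates with their neighbours; with a small enough gain, the outputs settle on a common, stable value.

    # Averaging-consensus sketch: each sub-network controller holds a local
    # estimate of the end-to-end bandwidth to reserve and only talks to its
    # neighbours. Names, topology and the gain are illustrative assumptions.
    estimates = {"ctrl1": 30.0, "ctrl2": 10.0, "ctrl3": 20.0}
    neighbours = {"ctrl1": ["ctrl2"], "ctrl2": ["ctrl1", "ctrl3"], "ctrl3": ["ctrl2"]}
    eps = 0.3                                    # gain; small enough for stability

    for _ in range(60):
        estimates = {
            c: x + eps * sum(estimates[n] - x for n in neighbours[c])
            for c, x in estimates.items()
        }

    print({c: round(x, 2) for c, x in estimates.items()})
    # all three converge to 20.0, the average of the initial estimates

The gain eps plays the role of the stability requirement named above: too large a value makes the iteration oscillate instead of converging.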
A second use case is related to the adoption of 0-Touch solutions in the dynamic allocation of IT virtual resources provided by a network organized as a distributed cloud. Applications to be executed on the network have to be equipped with the processing and storage resources required to fulfill their QoS requirements (e.g., response time), reliability and scalability. The resource allocation should be performed dynamically in order to react immediately to changes in application load, faults and variations in end-users' needs. Dynamic allocation can be driven by a set of rules determining the conditions for assigning and de-assigning resources to an application (the so-called elasticity rules). The conditions are based on predicates over Key Performance Indicators (KPIs) on parameters characterizing the configuration and the behavior of an application. A set of supervision functions, based on autonomic control loops, periodically computes the KPIs of the applications (possibly by collecting data from distributed systems) and checks them against the conditions; when they detect that the condition of a rule is verified (e.g., due to an increment or a decrement of traffic load, a fault, or the need to store new data), they activate the execution of the associated action, such as the allocation, deallocation or migration of some resources. The execution of these actions is performed by interacting with a resource control plane in charge of managing the set of available virtual resources and monitoring their allocation, possibly taking into account optimization constraints (e.g., on application performance or on infrastructure costs).

These 0-Touch supervision functions would, on the one hand, strongly simplify the process related to the provisioning of virtual resources to applications and, on the other hand, optimize the usage of the virtual resources by avoiding under-provisioning and over-provisioning situations through the introduction of dynamic KPI-based resource allocation. Examples of cloud infrastructures implementing an elastic allocation of virtual resources are described in [7] and [8]. Extensions should be elaborated in order to also deal with virtual resources provided by devices at the edge of the network: this scenario would require a higher degree of distribution of the monitoring and allocation functions. Moreover, as already mentioned at the beginning of this section, the algorithms for the allocation of IT virtual resources should also be enhanced in order to consider requirements on links (e.g., their QoS), so as to dynamically set up and tear down connectivity for interconnecting the IT virtual resources allocated to the same application. Analogously, KPI-based conditions should be extended in order to also consider parameters measuring the QoS of these links.

Another improvement to this use case is related to the need to create a dynamic relation between the allocation of IT virtual resources and connectivity. More IT resources could mean more data, and possibly streams of data, to be moved among the IT resources; these could require stringent QoS parameters for fast data transfer. The allocation of IT resources should go hand in hand with the allocation of connectivity. The Network Operating System should guarantee the possibility to achieve the needed configuration by allocating both IT and connectivity virtual resources.
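A minimal sketch of the elasticity-rule mechanism described above, with invented KPI names, thresholds and actions: each rule pairs a predicate over KPIs with an action to request from the resource control plane.

    # Elasticity-rule sketch: predicates over KPIs trigger allocation or
    # deallocation requests. KPI names, thresholds and the "control plane"
    # effect (an instance counter) are illustrative assumptions.
    rules = [
        {"when": lambda kpi: kpi["response_time_ms"] > 200, "action": "allocate"},
        {"when": lambda kpi: kpi["cpu_load"] < 0.2,         "action": "deallocate"},
    ]

    def supervision_step(kpi, instances):
        # Periodic autonomic check: fire the first rule whose condition holds.
        for rule in rules:
            if rule["when"](kpi):
                if rule["action"] == "allocate":
                    instances += 1               # ask the control plane for more
                elif rule["action"] == "deallocate" and instances > 1:
                    instances -= 1               # release unused resources
                break
        return instances

    instances = 1
    for kpi in [{"response_time_ms": 350, "cpu_load": 0.9},
                {"response_time_ms": 120, "cpu_load": 0.1}]:
        instances = supervision_step(kpi, instances)
        print(kpi, "->", instances, "instances")

Under this scheme, under- and over-provisioning are both transient: the first KPI sample triggers a scale-out, the second a scale-in.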
VII. OPERATORS AND THE 0-TOUCH APPROACH

A lot of research activities (e.g., GENI, FIRE, AKARI) are to some degree coping with the 0-Touch approach. Their proposition is towards a clean-slate architecture.
From the Network Operator point of view, the availability of solutions supporting 0-Touch networks is fundamental for solving a network paradox that is hampering Operators from doing business worldwide: the perimeterization of their networks, i.e., the strong coupling of services and network resources. If an Operator has no network in a certain area, it cannot provide its services in that region. The de-perimeterization of the network (e.g., achievable through the mechanisms of Network Uploading) also allows the dynamic composition of specific bound networks, or the possibility to create special networks on top of the physical network for serving specific communities or applications. In addition, progress in cognitive networking could also be beneficial for the core network of Operators, allowing future infrastructures to be lean and highly efficient. This will lead to a reduction of OPEX. Operators could then be more ready to cooperate at the edge of their networks.

The flexibility introduced by the 0-Touch approach has to be fully understood and exploited. For this reason it is fundamental to start developments and experiments to fully grasp the potential of this novel approach. The power of edge networking will increase in such a way that the Operators could potentially be put in a corner, with their networks less and less used in favor of connectivity offered by cooperating edge networks. However, Operators, with their controlled and well-balanced infrastructures, could be a strong element for healing the increasing complexity (and even chaos) at the edge. Their role could be essential in order to implement a viable cognitive layer.

The dynamics of value migration within the value chain are complex and difficult to estimate; in any case, it is reasonable to assume that the overall value of a cognitive network is potentially much greater (both for the Operator and for the Users joining actively by sharing their resources) than that of the traditional architecture, even taking higher manufacturing and consumer costs into account.

Eventually, 0-Touch will enable both a new generation of privately owned and community networks (thus allowing a potential split of local and global connectivity costs) and new business opportunities in synergy with the Internet of Things. Each actor in this value chain has to contribute resources and connectivity, as well as cooperating logic, controllers and algorithms. And this opens a new way of providing services.

ACKNOWLEDGMENT

The Authors wish to thank Dr. Roberto Saracco for his support and insights.

REFERENCES

[1] S. Dobson, S. Denazis, A. Fernández, D. Gaïti, E. Gelenbe, F. Massacci, P. Nixon, F. Saffre, N. Schmidt, F. Zambonelli, "A survey of autonomic communications", ACM Trans. Auton. Adapt. Syst., Vol. 1, No. 2 (2006), 223-259.
[2] A. Manzalini, P. H. Deussen, S. Nechifor et al., "Self-optimized Cognitive Network of Networks", The Computer Journal (2010), http://comjnl.oxfordjournals.org/content/54/2/189.full.pdf
[3] D. Clark, C. Partridge, J. C. Ramming, J. T. Wroclawski, "A knowledge plane for the Internet", in Proc. of SIGCOMM 2003.
[4] J. Gantz, "The Embedded Internet: Methodology and Findings", IDC (2009).
[5] P. Deussen, M. Baumgarten, M. Mulvenna, A. Manzalini, C. Moiso, "Component-ware for Autonomic Supervision Services - The CASCADAS Approach", Journal on Advances in Intelligent Systems, Vol. 3, No. 1&2 (2010), 87-105.
[6] Open Networking Foundation, http://www.opennetworkingfoundation.org/
[7] L. Rodero-Merino et al., "From infrastructure delivery to service management in clouds", Future Generation Computer Systems 26 (2010), 1226-1240.
[8] P. Ruth, J. Rhee, D. Xu, R. Kennell, S. Goasguen, "Autonomic Live Adaptation of Virtual Computational Environments in a Multi-Domain Infrastructure", in Proc. Conf. on Autonomic Computing (2006), 5-14.
[9] I. Foster, A. Iamnitchi, "On death, taxes, and the convergence of P2P and Grid Computing", Peer-to-Peer Systems II, LNCS 2735, 118-128.
[10] D. Oppenheimer, A. Ganapathi, D. A. Patterson, "Why do Internet services fail, and what can be done about it?", in Proc. USENIX USITS, Oct. 2003.